ABSTRACT

If one accounts for correlations between scales, then nonlocal, k-dependent halo
bias is part and parcel of the excursion set approach, and hence of halo model 
predictions for galaxy bias. We present an analysis that distinguishes between 
a number of different effects, each one of which contributes to scale-dependent 
bias in real space. We show how to isolate these effects and remove the scale de- 
pendence, order by order, by cross-correlating the halo field with suitably trans- 
formed versions of the mass field. These transformations may be thought of as 
simple one-point, two-scale measurements that allow one to estimate quantities 
which are usually constrained using n-point statistics. As part of our analysis, 
we present a simple analytic approximation for the first crossing distribution 
of walks with correlated steps which are constrained to pass through a specified 
point, and demonstrate its accuracy. Although we concentrate on nonlinear, non- 
local bias with respect to a Gaussian random field, we show how to generalize 
our analysis to more general fields. 

Key words: large-scale structure of Universe 



1 INTRODUCTION 

Galaxy clustering depends on galaxy type (Zehavi et al. 
2011 and references therein). Therefore, not all galaxies 
are fair tracers of the dark matter distribution. Precise 
constraints on cosmological models require a good under- 
standing of this galaxy bias (Sefusatti et al. 2006; More et 
al. 2012). In the simplest models, galaxies are linearly bi- 
ased tracers (Kaiser 1984), but, even at the linear level, 
this bias may depend on physical scale or wavenumber 
k (e.g. Desjacques et al. 2010; Matsubara 2011). This 
scale-dependence, which is clearly detected in simulations 
of hierarchical clustering models (Sheth & Tormen 1999; 
Smith et al. 2007; Manera et al. 2010), contains important 
information about the statistics of the initial fluctuation 
field, and the nature of gravity (Parfrey, Hui & Sheth 
2011; Lam & Li 2012). 

The most common galaxy bias model - the local bias 
model - assumes that the galaxy overdensity field $\delta_h(x)$
is a local, possibly nonlinear, monotonic, deterministic
transformation of the dark matter field $\delta(x)$ at the same
position (Fry & Gaztañaga 1993; Manera & Gaztañaga
2012; Pollack, Smith & Porciani 2012; Chan & Scocci- 
marro 2012). Even in this case, there are a number of 
ways in which scale dependence can arise, even for the 
simplest case of Gaussian initial conditions and standard 
gravity. Since the measured bias will generally be a com- 
bination of all these effects, we present some ideas on how 
to disentangle them from one another.

In general, of course, $\delta_h$ might depend on the value
of $\delta$ at different locations, on its derivatives (Desjacques
et al. 2010; Musso & Sheth 2012), on other higher or- 
der statistics of the field (e.g. Sheth, Mo & Tormen 2001; 
Sheth, Chan & Scoccimarro 2012) at the same or at dif- 
ferent positions, etc.; the dependence might even not be 
deterministic (e.g., Sheth & Lemson 1999; Dekel & Lahav 
1999). Our final goal is to present methods which are able 
to pinpoint this relation even when the bias is nonlinear, 
nonlocal and stochastic. 

We study insights which arise from the simplest 
treatment of halo bias: that based on the excursion set 
approach (Press & Schechter 1974). This approach maps 
the problem of counting the number of collapsed halos to 
that of the first crossing of a suitable threshold (the 'bar- 
rier') by random walks in density generated by smoothing 
the initial matter density field using a sequence of filters
of decreasing scales (Bond et al. 1991). In addition to 
depending on the 'barrier' shape, the first crossing dis- 
tribution also depends on how far from the 'origin' the 
walks happen to be for the largest smoothing scale $S_0$.

Walks that do not start from the origin have modified
first crossing distributions (Lacey & Cole 1993). This
introduces a dependence of the abundance of halos $1+\delta_h$
on the initial matter density field $\delta_0$ smoothed on the much
larger scale $S_0$, and hence leads to a prediction for halo
bias (Mo & White 1996).

The excursion set approach greatly simplifies when 
the smoothing filter is sharp in Fourier space, because 
in this case the steps in each walk are uncorrelated with 
each other. Since most analyses to date have relied on this 
choice, we use it to illustrate many of our key points. E.g., 
if the bias is deterministic and nonlinear in real-space, it 
will be stochastic in k-space. And, estimates of cross-
correlations between the halo and mass fields depend on 
the assumed form of the probability distribution function 
of the mass: one must be careful to use the appropriate 
probability density function (pdf). One of the key insights 
of this paper is to show that suitably defined real-space 
cross-correlation measurements allow one to extract the 
different bias coefficients, order by order. 

Recently, however, there has been renewed interest 
in studying the effects of smoothing with more realistic 
filters such as the TopHat in real space or the Gaussian. 
The problem is complicated in this case by the presence 
of nontrivial correlations between the steps of the ran- 
dom walks (Peacock & Heavens 1990; Bond et al. 1991), 
and a number of different approximations for the effect on 
the first crossing distribution have been introduced (Mag- 
giore & Riotto 2010; Paranjape, Lam & Sheth 2012). We 
show that the most accurate of these, due to Musso & 
Sheth (2012), can be extended to provide a very accurate 
model for walks which do not start from the origin. 

We then show that correlations between steps generi- 
cally introduce two additional sources of scale-dependent 
bias into the predictions. One is relatively benign, and 
simply arises from the fact that the excursion set prediction
is for a real-space quantity, but the halo bias in N-body
simulations is typically measured in Fourier space,
through ratios of power spectra. That this matters reit- 
erates a point first made by Paranjape & Sheth (2012), 
but it is easily accounted for by using a more appro- 
priate normalization of the bias coefficients. The second 
is more pernicious and is a genuinely new source of k- 
dependent bias (a point made in Musso & Sheth 2012, 
but not studied further). Although this complicates dis- 
cussion of scale-dependent bias, our method of measur- 
ing suitably defined real-space cross-correlations between 
the halo and mass fields can be used to extract the k- 
dependence of halo bias order by order. 

This paper is organised as follows. Section 2 briefly
summarizes known excursion set results for uncorrelated
steps, defines the halo bias factors as a ratio of real space
measurements, derives their large scale limiting values,
uses these to motivate a real-space cross-correlation
measurement at finite scale which returns these limiting
values, and quantifies the importance of computing averages
over the correct ensemble.

Section 3 extends these results to the case of correlated
steps. We first derive the conditional distribution
$f(s|\delta_0,S_0)$ that a walk crosses the barrier for the first time
at scale $s$ having taken up the value $\delta_0$ at scale $S_0$, and
demonstrate its accuracy by comparing with the results
of a Monte Carlo treatment of the problem. We then turn
to the problem of halo bias, and highlight some important
differences from the uncorrelated case: the question
of the correct pdf is shown to be much less important,
whereas the scale dependence of bias becomes more dramatic.
We discuss some of the implications of our analysis
and conclude in section 4. Appendix A collects proofs of
some results quoted in the text, while Appendix B connects
the bias coefficients defined using cross-correlation
measurements to other definitions in the literature.

Throughout we will present results for a constant 
barrier of height $\delta_c$. Moving barriers pose no conceptual
difficulty for the first crossing distributions we are inter- 
ested in. Also, while our analytical results are generally 
valid for any smoothing filter and power spectrum, for 
ease of implementation, the explicit comparisons with nu- 
merical solutions will use the Gaussian filter and a power 
law power spectrum. Again, we do not expect our final 
conclusions to depend on this choice. 



2 THE EXCURSION SET APPROACH: 
UNCORRELATED STEPS 

The excursion set ansatz relates the number of halos in
a mass range $(m, m+dm)$ to the fraction $f(s)$ of walks
that first cross the barrier in the scale range $(s, s+ds)$
through the relation

$$ \frac{m}{\bar\rho}\,\frac{dn}{dm}\,dm = f(s)\,ds , \qquad (1) $$

where $s = s(m) \equiv \langle\delta^2(m)\rangle$ is the variance of the matter
density field smoothed on a Lagrangian length scale
corresponding to mass $m$ and linearly extrapolated to
present day, and $\bar\rho$ is the background density.

In this approach, the influence of the underlying dark
matter field on the abundance of halos of mass $m$ (i.e. the
bias) can be estimated from the fraction $f(s|\delta_0,S_0)$ of
walks that first cross the barrier at $s$ starting from some
prescribed height $\delta_0$ on some prescribed scale $S_0$, rather
than from the origin (Mo & White 1996; Sheth & Tormen
1999). The mean number overdensity of halos can
be defined as

$$ \langle 1+\delta_h|\delta_0,S_0\rangle \equiv \frac{f(s|\delta_0,S_0)}{f(s)} , \qquad (2) $$

which is explicitly a prediction in real space, and valid on
scale $S_0$ in the Lagrangian initial conditions.

Typically, the bias is characterised by expanding the
above expression in powers of $\delta_0$. The coefficients of this
expansion will in general depend on $S_0$ (besides obviously
depending on $s$). Moreover, the evaluation of $f(s)$
and $f(s|\delta_0,S_0)$ (and therefore of the bias coefficients) is
rather different depending on whether or not the steps
in the walk are correlated. In what follows, we elucidate
the issue of the scale dependence of the bias coefficients in
the simpler case of walks with uncorrelated steps. We also
argue that the same coefficients can be obtained as the
mean value of the product of $\langle 1+\delta_h|\delta_0,S_0\rangle$ and
polynomials in $\delta_0$, weighted by the probability distribution of
$\delta_0$. This alternative definition as an expectation value is
better suited to extension to the case of correlated
steps (section 3), and to making contact with the definition
of bias in generic models other than the excursion
set approach (Appendix B).

2.1 Large scale Lagrangian bias factors 

The conditional first crossing distribution of a constant
barrier $\delta_c$ for walks with uncorrelated steps is (Bond et
al. 1991; Lacey & Cole 1993)

$$ f_u(s|\delta_0,S_0) = \frac{\delta_c-\delta_0}{\sqrt{2\pi}\,(s-S_0)^{3/2}}\; e^{-(\delta_c-\delta_0)^2/2(s-S_0)} \qquad (3) $$

(the subscript in $f_u$ standing for "uncorrelated"), where
$\delta_c > \delta_0$ and $s > S_0$. The corresponding unconditional
distribution is $s f_u(s) = (2\pi)^{-1/2}\,\nu\,e^{-\nu^2/2}$ where $\nu^2 = \delta_c^2/s$.
In this case, setting $S_0 = 0$ and expanding around
$\delta_0 = 0$ leads to (Mo & White 1996; Mo, Jing & White
1997)

$$ \frac{f_u(s|\delta_0,S_0=0)}{f_u(s)} = \sum_{n=0}^{\infty} \frac{\delta_0^n}{n!}\,b_n , \qquad (4) $$

with the bias coefficients given by

$$ \delta_c^n\,b_n = \nu^{n-1}\,H_{n+1}(\nu) , \qquad (5) $$

where $H_m(x) = e^{x^2/2}\,(-d/dx)^m\,e^{-x^2/2}$ are the "probabilist's"
Hermite polynomials. For example, $n = 1$ returns
the familiar expression for the linear halo bias
$b_1 = (\nu^2-1)/\delta_c$. Note that these coefficients are pure
numbers, independent of wavenumber $k$, and (by definition)
of $S_0$. It is these scale-independent numbers which
are most often used to derive cosmological constraints.
(Of course, for non-negligible $S_0$, the Taylor series expansion
of equation (2) will yield bias coefficients that
depend on $S_0$, but this dependence is almost never calculated
or used.) Since the $S_0 \to 0$ limit of equation (3)
corresponds to setting $\delta_c \to \delta_c - \delta_0$ in the unconditional
crossing distribution, these $b_n$ are simply related to the
$n$th derivative of $s f_u(s)$ with respect to $\delta_c$. This makes
it easy to see why the Hermite polynomials feature so
prominently in much of what follows.
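
The relation between the Taylor expansion (4) and the Hermite form (5) can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the default barrier height `delta_c = 1.686` (the usual spherical collapse value) is an assumption made only for illustration.

```python
import math

def hermite_e(n, x):
    """Probabilist's Hermite polynomial H_n(x), from H_{k+1} = x*H_k - k*H_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def lagrangian_bias(n, nu, delta_c=1.686):
    """Equation (5): delta_c^n * b_n = nu^(n-1) * H_{n+1}(nu)."""
    return nu ** (n - 1) * hermite_e(n + 1, nu) / delta_c ** n

def conditional_ratio(delta0, nu, delta_c=1.686):
    """f_u(s|delta0, S0=0) / f_u(s) from equation (3), with s = (delta_c/nu)^2."""
    s = (delta_c / nu) ** 2
    shift = delta_c - delta0
    return (shift / delta_c) * math.exp(-(shift ** 2 - delta_c ** 2) / (2.0 * s))

def taylor_ratio(delta0, nu, delta_c=1.686, n_max=30):
    """Right-hand side of equation (4), truncated at order n_max."""
    return sum(delta0 ** n / math.factorial(n) * lagrangian_bias(n, nu, delta_c)
               for n in range(n_max + 1))

if __name__ == "__main__":
    # The truncated series and the closed-form ratio should agree closely.
    print(conditional_ratio(0.3, 1.5), taylor_ratio(0.3, 1.5))
```

The Hermite polynomials are built from their three-term recurrence so that the sketch needs only the standard library.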

2.2 A weighted-average definition of bias 

If we ignore the fact that the conditional distribution in
equation (3) should really have $\delta_0 < \delta_c$, then it is easy to
check that

$$ f_u(s) = \int_{-\infty}^{\infty} d\delta_0\, f_u(s|\delta_0,S_0)\,p_G(\delta_0;S_0) , \qquad (6) $$

where $p_G(\delta_0;S_0)$ is a Gaussian distribution with zero
mean and variance $S_0$. Although this result is formally
correct, we argue in the next subsection that the appropriate
distribution over which to average should not be a
Gaussian (nor even a Gaussian chopped at $\delta_0 > \delta_c$). But
if we continue to ignore this detail, then we find

$$ \int_{-\infty}^{\infty} d\delta_0\, p_G(\delta_0;S_0)\,\frac{f_u(s|\delta_0,S_0)}{f_u(s)}\,\delta_0 = S_0\,b_1 $$

(Desjacques et al. 2010), and more generally, the orthogonality
of the Hermite polynomials implies (Appendix A1)
that

$$ S_0^{-n/2}\left\langle \frac{f_u(s|\delta_0,S_0)}{f_u(s)}\; H_n\!\left(\frac{\delta_0}{\sqrt{S_0}}\right) \right\rangle = b_n . \qquad (7) $$

This exact result is remarkable because the left hand side
involves quantities for an arbitrary $S_0$, whereas the right
hand side, which is independent of $S_0$, is simply the $S_0 \to 0$
limit of the appropriate bias coefficient. This is not at
all obvious if one had viewed the local bias expansion as
a formal Taylor series: one would naively have thought
that, at the very least, the cross correlation $\langle(1+\delta_h)\delta_0\rangle$
should involve the bias coefficients of all (odd) orders (for
a further discussion, see Frusciante & Sheth 2012).
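
The Gaussian-averaged identity of equation (7) can be verified by direct quadrature. The sketch below is illustrative only: the function names, the Simpson integrator, the integration range of 12 standard deviations and the default `delta_c = 1.686` are all assumptions, and equation (3) is continued unchanged to $\delta_0 > \delta_c$ (where it is negative), as the identity formally requires.

```python
import math

def hermite_e(n, x):
    """Probabilist's Hermite polynomial H_n(x) by recurrence."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def gaussian(x, var):
    """Zero-mean Gaussian density with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def f_u(s, delta0, S0, delta_c):
    """Equation (3), continued as written to delta0 > delta_c."""
    d = delta_c - delta0
    return d * math.exp(-d * d / (2.0 * (s - S0))) / (
        math.sqrt(2.0 * math.pi) * (s - S0) ** 1.5)

def simpson(func, lo, hi, n_intervals=4000):
    """Composite Simpson rule; n_intervals must be even."""
    h = (hi - lo) / n_intervals
    total = func(lo) + func(hi)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def bias_cross_correlation(n, s, S0, delta_c=1.686):
    """Left-hand side of equation (7), averaging over the full Gaussian p_G."""
    f_unc = f_u(s, 0.0, 0.0, delta_c)
    width = 12.0 * math.sqrt(S0)

    def integrand(d0):
        return (gaussian(d0, S0) * f_u(s, d0, S0, delta_c) / f_unc
                * hermite_e(n, d0 / math.sqrt(S0)))

    return simpson(integrand, -width, width) / S0 ** (n / 2.0)

def bias_large_scale(n, s, delta_c=1.686):
    """Right-hand side of equation (7), i.e. b_n from equation (5)."""
    nu = delta_c / math.sqrt(s)
    return nu ** (n - 1) * hermite_e(n + 1, nu) / delta_c ** n
```

Calling `bias_cross_correlation(n, s, S0)` for several values of `S0 < s` should return the same number as `bias_large_scale(n, s)`, which illustrates that the result does not depend on the smoothing scale.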

Strictly speaking, this is only a mathematical curiosity,
since the conditional distribution $f_u(s|\delta_0,S_0)$ is
formally zero for $\delta_0 > \delta_c$, but the identity above holds
only when (incorrectly) averaging the expression in equation (3)
over the full (Gaussian) distribution of $\delta_0$. However, if we
forget for the moment about how the bias factors in equation
(5) were determined, then the analysis above shows
that the $S_0 \to 0$ limit of the bias coefficients can be recovered
by cross-correlating the halo overdensity field with a
suitably transformed version of the mass field (the transformation
uses Hermite polynomials). In particular, our
cross-correlation method works for any smoothing scale
$S_0$; there is no requirement that this scale be large (although,
strictly speaking, one does require that $S_0 < s$,
i.e., that the smoothing scale be larger than that used for
defining the halos in the first place).

There are two important lessons here. First, treating
the $S_0 \to 0$ limit of the bias coefficients as though
they are arbitrary is risky: one must be careful to ensure
that the implied conditional distribution function is sensible
(e.g. positive definite). Except for the coefficients
which come from the more physically motivated excursion
set approach, this is rarely done. We return to
this point in the next subsection. The second lesson is
that cross-correlating with appropriate transformations
of the mass field may be an efficient way of isolating the
different large scale bias coefficients from one another.
One view of this second lesson is to contrast it with the
usual probe of higher order bias factors: 2-point statistics
constrain $b_1$, 3-point statistics constrain both $b_1$ and $b_2$,
and so on (Sefusatti & Scoccimarro 2005; Smith et al.
2007; Pollack et al. 2012). Since the Hermite polynomials
here are polynomials, one may think of the transformation
as picking out that combination of $n$-point functions
which isolates the dependence on $b_n$. The analysis above
suggests that, once the appropriate transformation has
been made, $b_n$ can be determined by a real-space
cross-correlation measurement alone, and this cross-correlation
can be made on any smoothing scale $S_0$; there is no
requirement that this scale be large.

2.3 Scale dependence from appropriate 
averaging 

The previous subsection noted that a naive averaging of
$\langle 1+\delta_h|\delta_0\rangle$ over a Gaussian distribution appeared to
return the large-scale $S_0 \to 0$ bias factors. However, the
correct distribution over which to average is not a Gaussian,
but

$$ q_u(\delta_0,S_0;\delta_c) = \frac{e^{-\delta_0^2/2S_0} - e^{-(2\delta_c-\delta_0)^2/2S_0}}{\sqrt{2\pi S_0}} , \qquad (8) $$

where $\delta_0 < \delta_c$ (Sheth & Lemson 1999). This is because
$q_u(\delta_0,S_0;\delta_c)$ gives the probability that the walk
had height $\delta_0$ at scale $S_0$, and remained below the barrier
$\delta_c$ on all scales $S < S_0$ (Chandrasekhar 1943). It is
easy to check that

$$ f_u(s) = \int_{-\infty}^{\delta_c} d\delta_0\, f_u(s|\delta_0,S_0)\,q_u(\delta_0,S_0;\delta_c) , \qquad (9) $$

as it should.

For similar reasons, whenever one deals with the conditional
mean $\langle 1+\delta_h|\delta_0\rangle$, the appropriate way to compute
cross correlations between the halo overdensity field
and the mass is by averaging over $q$ and not $p$, and this
generically makes the measured coefficients depend on
scale $S_0$ as we discuss below. For $n = 1$, this yields

$$ \frac{\delta_c}{S_0}\,\big\langle \delta_0\,\langle 1+\delta_h|\delta_0\rangle \big\rangle_q = H_2(\nu) + (\nu_{10}^2+1)\,{\rm erfc}(\nu_{10}/\sqrt{2}) - \sqrt{2/\pi}\;\nu_{10}\,e^{-\nu_{10}^2/2} , \qquad (10) $$

where $\nu_{10}^2 = \nu^2\,(s/S_0 - 1)$ (equation 17 in Sheth & Lemson
1999). Note that, in contrast to the previous calculation,
this quantity yields $H_2(\nu)$ only in the limit $S_0 \to 0$.
Similarly, averaging $(1+\delta_h)H_n$ over $q$ yields a more
complicated function of $S_0$.
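
A quadrature check of the $q$-averaged result in equation (10) might look as follows. This is a sketch under stated assumptions: the function names, the Simpson integrator and the lower integration limit of 12 standard deviations below zero are illustrative choices, not part of the original analysis.

```python
import math

def f_u(s, delta0, S0, delta_c):
    """Equation (3): conditional first crossing distribution, uncorrelated steps."""
    d = delta_c - delta0
    return d * math.exp(-d * d / (2.0 * (s - S0))) / (
        math.sqrt(2.0 * math.pi) * (s - S0) ** 1.5)

def q_u(delta0, S0, delta_c):
    """Equation (8): walks at height delta0 on scale S0 that stayed below delta_c."""
    norm = math.sqrt(2.0 * math.pi * S0)
    return (math.exp(-delta0 ** 2 / (2.0 * S0))
            - math.exp(-(2.0 * delta_c - delta0) ** 2 / (2.0 * S0))) / norm

def simpson(func, lo, hi, n_intervals=4000):
    """Composite Simpson rule; n_intervals must be even."""
    h = (hi - lo) / n_intervals
    total = func(lo) + func(hi)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def q_averaged_linear_bias(s, S0, delta_c):
    """Left-hand side of equation (10), integrating delta0 from far below 0 up to delta_c."""
    f_unc = f_u(s, 0.0, 0.0, delta_c)

    def integrand(d0):
        return d0 * q_u(d0, S0, delta_c) * f_u(s, d0, S0, delta_c) / f_unc

    return delta_c / S0 * simpson(integrand, -12.0 * math.sqrt(S0), delta_c)

def closed_form(s, S0, delta_c):
    """Right-hand side of equation (10)."""
    nu2 = delta_c ** 2 / s
    nu10 = math.sqrt(nu2 * (s / S0 - 1.0))
    return (nu2 - 1.0
            + (nu10 ** 2 + 1.0) * math.erfc(nu10 / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * nu10 * math.exp(-nu10 ** 2 / 2.0))
```

As `S0` shrinks relative to `s`, both functions approach $H_2(\nu) = \nu^2 - 1$, while for larger `S0` the extra terms become noticeable.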

We have verified these analytical arguments in a
comparison with numerical results. The symbols in Figure 1
show a measurement of the cross-correlation between
$\delta_0$ and $(1+\delta_h)$ in a Monte Carlo simulation of
random walks with uncorrelated steps. We generate these
walks by accumulating independent Gaussian draws, each
with zero mean and variance $(\Delta\sigma)^2$. For each such walk,
we note the scale $s$ at which it first crossed a constant
barrier $\delta_c$, as well as its height $\delta_0$ at a chosen scale
$S_0$. The Figure shows results for $\Delta\sigma/\delta_c = 0.025$ and
$S_0/\delta_c^2 = 0.25$. To measure the correlation in a given bin
in $\nu$ (with $\nu^2 = \delta_c^2/s$), we identify those walks that first
cross $\delta_c$ in this bin. If these are $N_\nu$ in number, we compute the
mean $\sum_j H_n(\delta_{0j}/\sqrt{S_0})/N_\nu$, where $\delta_{0j}$ is the height at
$S_0$ of the $j$th such walk. Dividing this mean by $S_0^{n/2}$ gives
the numerical estimate of $b_n$. Since the first crossing of
$\delta_c$ for these walks is at $s > S_0$ by construction, this
measurement is a $q$-averaged one.

Figure 1. Monte-Carlo measurement of the cross-correlation
between $\delta_h$ and $\delta_0$ (top) and $H_2$ (bottom), for walks with
uncorrelated steps. See the main text for a description of the
measurement. Solid curves show the prediction associated with
averaging over a Gaussian distribution, as is commonly done,
and which the main text argued was inappropriate, and dashed
curves show the result of averaging using $q$ of equation (8).

The two panels show the measurements for $n = 1$
and 2, and the solid and dashed curves show the analytic
result of averaging using $p$ and $q$, respectively. The
solid curve remains the same for all $S_0$ (equation 7) but
the dashed one does not (e.g. equation 10). Therefore,
the difference between the solid and dashed predictions
depends on $S_0$; we have checked that averaging over $q_u$
always yields the correct, scale-dependent value.
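
The Monte Carlo measurement described above can be sketched in a few lines. This is an illustrative toy, not the simulation used for Figure 1: the function names are hypothetical, heights are in units of $\delta_c$ (so `delta_c = 1.0`), and the example below uses a coarser step than the $\Delta\sigma/\delta_c = 0.025$ quoted in the text so that it runs quickly in pure Python.

```python
import math
import random

def hermite_e(n, x):
    """Probabilist's Hermite polynomial H_n(x) by recurrence."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def first_crossings(n_walks, step_sigma, n0, n_max, delta_c=1.0, seed=1):
    """Simulate walks with uncorrelated Gaussian steps.

    Returns (s, delta0) for each walk whose first crossing of delta_c occurs
    after step n0; S0 = n0 * step_sigma**2 and s is the first-crossing variance.
    Walks that cross at or before S0, or not at all by n_max steps, are dropped.
    """
    rng = random.Random(seed)
    step_var = step_sigma ** 2
    kept = []
    for _ in range(n_walks):
        height, delta0 = 0.0, None
        for i in range(1, n_max + 1):
            height += rng.gauss(0.0, step_sigma)
            if i == n0:
                delta0 = height          # height of the walk on scale S0
            if height >= delta_c:
                if i > n0:
                    kept.append((i * step_var, delta0))
                break
    return kept

def estimate_bias(kept, n, S0, nu_lo, nu_hi, delta_c=1.0):
    """Mean of H_n(delta0/sqrt(S0)) / S0^(n/2) over walks with nu_lo <= nu < nu_hi."""
    values = [hermite_e(n, d0 / math.sqrt(S0)) for s, d0 in kept
              if nu_lo <= delta_c / math.sqrt(s) < nu_hi]
    if not values:
        return None
    return sum(values) / len(values) / S0 ** (n / 2.0)

if __name__ == "__main__":
    walks = first_crossings(n_walks=2000, step_sigma=0.1, n0=25, n_max=200)
    print(estimate_bias(walks, 1, 25 * 0.1 ** 2, 0.8, 1.2))
```

Because only walks that first cross after $S_0$ are kept, the estimate is $q$-averaged by construction; a finite step size also biases the first crossing scales slightly, so a production run would use much smaller steps and many more walks.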
While this agreement demonstrates that we have a
good understanding of just what it is that the excursion
set returns, and of the $S_0$ scale-dependence of the bias
coefficients returned by a cross-correlation measurement,
it has also shown that averaging equation (3) over a
Gaussian distribution (rather than $q_u$) will lead to incorrect
estimates of the bias factors and their scale dependence.

The fact that $q \neq p$ leads to measurable differences
suggests that unless one has a good model of how both
$f$ and $q$ depend on scale, one must use large survey volumes
(to ensure one is safely in the $S_0 \to 0$ limit) if
halo bias (e.g. equation 4) is to constrain parameters.
At the moment, this understanding exists only for the
special case of predictions based on walks with uncorrelated
steps. Unfortunately, for the $q$-averaging which we
have argued is the more appropriate, it is not straightforward
to separate out the scale independent terms $H_n(\nu)$
from those which depend on $S_0$ (e.g., through $\nu_{10}$). If
it were, we would be able to derive cosmological constraints
from smaller volumes. As it stands, if walks with
uncorrelated steps were a realistic model, then for halos
with $\nu \sim 1.3$ (mass $m \sim 10^{13}h^{-1}M_\odot$ or Lagrangian
scales $R \sim 3h^{-1}$Mpc), in order to achieve percent level
accuracy in predicting the scale-independent $b_1$ ($b_2$), one
would need to work at scales $S_0/\delta_c^2 \simeq 0.155$ (0.115) or
Lagrangian scales $R_0 \sim 10h^{-1}$Mpc ($14h^{-1}$Mpc). We now
turn to a study of these issues for the more realistic case
of walks with correlated steps.



3 THE EXCURSION SET APPROACH 
WITH CORRELATED STEPS 

We would like to extend the analysis of the previous sec- 
tion to include the effects of correlated steps. To do so, 
we must first set up some notation. 



3.1 Notation 

Let us recall some standard results regarding Gaussian
distributions, which we will use frequently. If the joint
distribution $p(x_1,x_2)$ for two variables is the bivariate
Gaussian with zero mean, then

$$ p(x_1,x_2) = p_G(\mathbf{x};C) \equiv \frac{e^{-\mathbf{x}^{T}C^{-1}\mathbf{x}/2}}{\sqrt{(2\pi)^2\,{\rm Det}[C]}} , \qquad (11) $$

where $C$ is the covariance matrix, $C_{ij} = \langle x_i x_j\rangle$.

If the joint distribution $p(x_1,x_2,x_3)$ for three variables
is a trivariate Gaussian, then the conditional distribution
$p(x_1,x_2|x_3)$ is also a bivariate Gaussian:

$$ p(x_1,x_2|x_3) = p_G(\mathbf{x}-\bar{\mathbf{x}};\,C-c) , \qquad (12) $$

where the conditional mean $\bar{\mathbf{x}}$ is proportional to $x_3$,

$$ \bar{\mathbf{x}} = x_3\left( \frac{\langle x_1x_3\rangle}{\langle x_3^2\rangle} ,\ \frac{\langle x_2x_3\rangle}{\langle x_3^2\rangle} \right) . \qquad (13) $$

The "correction" to the covariance matrix, $c$, accounts
for that part of the correlation between $x_1$ and $x_2$
which is due to a correlation with $x_3$. Its components
are $c_{11} = \langle x_1x_3\rangle^2/\langle x_3^2\rangle$, $c_{22} = \langle x_2x_3\rangle^2/\langle x_3^2\rangle$ and
$c_{12} = \langle x_1x_3\rangle\langle x_3x_2\rangle/\langle x_3^2\rangle$.
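
The conditioning rules in equations (12) and (13) translate directly into code. The helper below is a hypothetical illustration (its name and the nested-list representation of the covariance are assumptions); in the excursion set application one would take $x_1 = \delta$, $x_2 = \delta'$ and $x_3 = \delta_0$.

```python
def condition_on_third(cov):
    """Condition a zero-mean trivariate Gaussian (x1, x2, x3) on x3.

    cov is the 3x3 covariance as nested lists. Returns (mean_coeffs, cond_cov):
    the conditional mean of (x1, x2) is x3 * mean_coeffs (equation 13) and
    cond_cov is the matrix C - c of equation (12).
    """
    var3 = cov[2][2]
    mean_coeffs = [cov[0][2] / var3, cov[1][2] / var3]
    # c_ij = <x_i x3> <x_j x3> / <x3^2>
    c = [[cov[i][2] * cov[j][2] / var3 for j in range(2)] for i in range(2)]
    cond_cov = [[cov[i][j] - c[i][j] for j in range(2)] for i in range(2)]
    return mean_coeffs, cond_cov

if __name__ == "__main__":
    example = [[2.0, 0.5, 1.0],
               [0.5, 1.0, 0.3],
               [1.0, 0.3, 4.0]]
    print(condition_on_third(example))
```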



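In the excursion set framework one is interested in
$p(\delta,\delta')$ and $p(\delta,\delta'|\delta_0)$, where $\delta'$ is the "curvature" of the
walk at scale $s$, $\delta' = d\delta/ds$. Since all three quantities $\delta$,
$\delta'$ and $\delta_0$ are essentially linear combinations of the underlying
Gaussian-distributed Fourier modes, both these
distributions are also Gaussian. In this case $\langle\delta^2\rangle = s$,
$\langle\delta\delta'\rangle = (1/2)(d/ds)\langle\delta^2\rangle = 1/2$ and $\langle\delta_0^2\rangle = S_0$, and the
relevant quantities read

$$ C = \begin{pmatrix} s & 1/2 \\ 1/2 & \langle\delta'^2\rangle \end{pmatrix} \quad{\rm and}\quad c = \frac{S_\times^2}{S_0\,s}\begin{pmatrix} s & \epsilon_\times/2 \\ \epsilon_\times/2 & \epsilon_\times^2/4s \end{pmatrix} , \qquad (14) $$

$$ \bar{\mathbf{x}} = \delta_0\,\frac{S_\times}{S_0}\begin{pmatrix} 1 \\ \epsilon_\times/2s \end{pmatrix} , \qquad (15) $$

where

$$ S_\times \equiv \langle\delta\delta_0\rangle \quad{\rm and}\quad \epsilon_\times \equiv 2s\,\frac{\langle\delta'\delta_0\rangle}{\langle\delta\delta_0\rangle} . \qquad (16) $$

For a Gaussian filter, $W(kR) = \exp(-k^2R^2/2)$, one has

$$ S_\times = \sigma_{0\times}^2 \quad{\rm and}\quad \epsilon_\times = \frac{\sigma_{1\times}^2\,\sigma_0^2}{\sigma_{0\times}^2\,\sigma_1^2} , \qquad (17) $$

where

$$ \sigma_{j\times}^2 \equiv \int \frac{dk}{k}\,\frac{k^3P(k)}{2\pi^2}\,k^{2j}\,W(kR)\,W(kR_0) \qquad (18) $$

(and $\sigma_j^2$ denotes the same integral with $R_0 = R$). If, in addition, $P(k) \propto k^n$, then

$$ \frac{S_\times}{S_0} = 2^{(n+3)/2}\left[1+(S_0/s)^{2/(n+3)}\right]^{-(n+3)/2} , \qquad (19) $$

$$ \epsilon_\times = 2\,(S_0/s)^{2/(n+3)}\left[1+(S_0/s)^{2/(n+3)}\right]^{-1} . \qquad (20) $$

We will also use the same notation $p_G(z;\sigma^2)$ to denote a
one-dimensional Gaussian distribution when there is no
scope for confusion.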
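
For the power-law, Gaussian-filter case the cross-correlation quantities are simple closed forms, which the following sketch evaluates. The function names are hypothetical. The consistency check uses the fact that, with $\delta' = d\delta/ds$, the definition of $\epsilon_\times$ is equivalent to $\epsilon_\times = 2\,d\ln S_\times/d\ln s$ at fixed $S_0$.

```python
import math

def cross_terms(S0_over_s, n):
    """Return (S_x/S_0, eps_x) for Gaussian smoothing of P(k) ~ k^n, with n > -3."""
    p = 2.0 / (n + 3.0)
    rp = S0_over_s ** p
    Sx_over_S0 = (2.0 / (1.0 + rp)) ** ((n + 3.0) / 2.0)
    eps_x = 2.0 * rp / (1.0 + rp)
    return Sx_over_S0, eps_x

def eps_from_derivative(S0_over_s, n, h=1e-5):
    """eps_x = 2 dln(S_x)/dln(s) at fixed S0, by central finite difference."""
    # Raising ln(s) by h at fixed S0 multiplies S0/s by exp(-h).
    up, _ = cross_terms(S0_over_s * math.exp(-h), n)
    down, _ = cross_terms(S0_over_s * math.exp(h), n)
    return 2.0 * (math.log(up) - math.log(down)) / (2.0 * h)

if __name__ == "__main__":
    print(cross_terms(0.25, -1.2), eps_from_derivative(0.25, -1.2))
```

At $S_0 = s$ both quantities equal 1, and as $S_0/s \to 0$ one finds $\epsilon_\times \to 0$ while $S_\times/S_0 \to 2^{(n+3)/2}$.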



3.2 The unconditional distribution 

Although our goal is to write down the analogue of equation (3)
for the first crossing distribution associated with
walks which are conditioned to pass through $\delta_0$ on scale
$S_0$, our first step is to write down the unconstrained distribution.
As shown by Musso & Sheth (2012), for a constant
barrier of height $\delta_c$ the latter is well-approximated
by

$$ f(s) = \int_0^\infty d\delta'\,\delta'\,p(\delta_c,\delta') , \qquad (21) $$

where $p(\delta_c,\delta')$ is the bivariate Gaussian $p_G(\delta_c,\delta';C)$ with
covariance matrix given in equation (14).

The integral in equation (21) can be performed analytically
and leads to

$$ s f(s) = \frac{\nu\,e^{-\nu^2/2}}{2\sqrt{2\pi}}\left[ \frac{1+{\rm erf}(\Gamma\nu/\sqrt{2})}{2} + \frac{e^{-\Gamma^2\nu^2/2}}{\sqrt{2\pi}\,\Gamma\nu} \right] , \qquad (22) $$

with

$$ \Gamma^2 = \frac{\gamma^2}{1-\gamma^2} \quad{\rm and}\quad \gamma^2 = \frac{\langle\delta\delta'\rangle^2}{\langle\delta^2\rangle\langle\delta'^2\rangle} . \qquad (23) $$

(Equation (22) corrects a typo in equation 6 of the published
version of Musso & Sheth 2012.) For later convenience,
the same can also be written as

$$ s f(s) = \frac{e^{-\nu^2(1+\Gamma^2)/2}}{(2\pi)(2\Gamma)}\,(1+A) , \qquad (24) $$

where

$$ A \equiv \sqrt{2\pi}\,\Gamma\nu\,e^{\Gamma^2\nu^2/2}\;\frac{1+{\rm erf}(\Gamma\nu/\sqrt{2})}{2} . \qquad (25) $$

Musso & Sheth (2012) showed that this approximation
(as well as its generalisation to moving barriers) works
extremely well over a large range of scales for a range
of choices of power spectra and filters (including TopHat
filtered ΛCDM) when compared with Monte Carlo solutions
of the first crossing problem. (It breaks down in the
limit in which the walks must have taken many steps to
cross the barrier.) Our final analytic results in this paper
will be valid for arbitrary power spectra and filters for a
constant barrier. However, since explicit expressions for
various quantities greatly simplify for the choice of Gaussian
smoothing of power law power spectra, we will show
comparisons with Monte Carlo solutions for the latter.
For Gaussian smoothing, $\gamma = \sigma_1^2/\sigma_0\sigma_2$.
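
The two equivalent forms of the unconditional distribution can be compared numerically. In this sketch the function names are hypothetical, and `gamma_power_law` assumes Gaussian smoothing of $P(k) \propto k^n$, for which the standard Gamma-function integrals give $\gamma^2 = (n+3)/(n+5)$ and hence $\Gamma^2 = (n+3)/2$; that closed form is an assumption of the example rather than a result quoted in the text.

```python
import math

def sf_uncorrelated(nu):
    """s f_u(s) for uncorrelated (sharp k-space) steps."""
    return nu * math.exp(-nu * nu / 2.0) / math.sqrt(2.0 * math.pi)

def sf_correlated(nu, Gamma):
    """s f(s) in the form with the square bracket (error function plus Gaussian tail)."""
    x = Gamma * nu
    bracket = (0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
               + math.exp(-x * x / 2.0) / (math.sqrt(2.0 * math.pi) * x))
    return nu * math.exp(-nu * nu / 2.0) / (2.0 * math.sqrt(2.0 * math.pi)) * bracket

def sf_correlated_alt(nu, Gamma):
    """The same quantity rewritten with an overall exp(-nu^2 (1 + Gamma^2) / 2)."""
    x = Gamma * nu
    A = (math.sqrt(2.0 * math.pi) * x * math.exp(x * x / 2.0)
         * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
    prefactor = math.exp(-nu * nu * (1.0 + Gamma ** 2) / 2.0) / (2.0 * math.pi * 2.0 * Gamma)
    return prefactor * (1.0 + A)

def gamma_power_law(n):
    """Gamma for Gaussian smoothing of P(k) ~ k^n (assumes gamma^2 = (n+3)/(n+5))."""
    g2 = (n + 3.0) / (n + 5.0)
    return math.sqrt(g2 / (1.0 - g2))

if __name__ == "__main__":
    G = gamma_power_law(-1.2)
    print(sf_correlated(1.2, G), sf_correlated_alt(1.2, G))
```

For very large `Gamma` the bracket tends to 1, so `sf_correlated` tends to half of `sf_uncorrelated`, the familiar factor of two between the two treatments.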

3.3 The conditional distribution 

Musso & Sheth (2012) argued that the conditional distribution
corresponding to (21) is simply

$$ f(s|\delta_0,S_0) = \int_0^\infty d\delta'\,\delta'\,p(\delta_c,\delta'|\delta_0) , \qquad (26) $$

where $p(\delta_c,\delta'|\delta_0)$ is the probability that the walk had a
height $\delta_c$ and curvature $\delta'$ at scale $s$, given that it passed
through $\delta_0$ at scale $S_0 < s$. In principle, one is really interested
in imposing the stronger condition that the walk
must have passed through $(\delta_0,S_0)$ without having crossed
$\delta_c$ before $S_0$. We will return to this point later and argue
that the effects of ignoring this stronger requirement are
small.

The conditional distribution $p(\delta_c,\delta'|\delta_0)$ is the bivariate
Gaussian

$$ p(\delta_c,\delta'|\delta_0) = p_G(\boldsymbol{\Delta}-\bar{\mathbf{x}};\,C-c) , \qquad (27) $$

with $\boldsymbol{\Delta} = (\delta_c,\delta')$, $C$ and $c$ given by equation (14) and the
conditional mean $\bar{\mathbf{x}}$ given by equation (15). For generic
power spectra and filters, the integral in equation (26)
can be performed analytically, exactly as in the case of
equation (21), and expressed in terms of $S_\times/S_0$ and $\epsilon_\times$.
The result is

$$ f(s|\delta_0,S_0) = \frac{\bar\delta'\,e^{-\delta_{c\times}^2/2sQ}}{\sqrt{2\pi sQ}}\left[ \frac{1+{\rm erf}(\bar\delta'/\sqrt{2}\sigma)}{2} + \frac{e^{-\bar\delta'^2/2\sigma^2}}{\sqrt{2\pi}\,(\bar\delta'/\sigma)} \right] , \qquad (28) $$

where

$$ Q = 1-\frac{S_\times^2}{s\,S_0} , \qquad \delta_{c\times} = \delta_c - \delta_0\,\frac{S_\times}{S_0} , \qquad (29) $$

$$ \bar\delta' = \frac{1}{2sQ}\left[ \delta_{c\times} + \epsilon_\times\,\frac{S_\times}{S_0}\left( \delta_0 - \delta_c\,\frac{S_\times}{s} \right) \right] , \qquad (30) $$

and

$$ \sigma^2 = {\rm Var}(\delta'|\delta_c,\delta_0) = \frac{1}{4\Gamma^2 s}\left[ 1 - \frac{\Gamma^2\,(1-Q)\,(1-\epsilon_\times)^2}{Q} \right] . \qquad (31) $$

Note that, in contrast to equation (3), this expression
for the conditional distribution remains positive definite
even when $\delta_0 > \delta_c$, although it is understood that only
$\delta_0 < \delta_c$ is sensible. For future reference, the sharp $k$-space
filter has $S_\times/S_0 = 1$ and $\epsilon_\times = 0$; its conditional crossing
distribution, equation (3), corresponds to inserting these
values in equation (28) and replacing the term in square
brackets with a factor of 2.

Figure 2. First crossing distribution of a barrier of height $\delta_c$
by the subset of walks which are conditioned to pass through
$(\delta_0,S_0)$, for a few choices of $\delta_0$ (as labelled). Short dashed,
solid and long-dashed curves show the analytic prediction from
equation (28) for Gaussian smoothing of a Gaussian field with
$P(k) \propto k^{-1.2}$.

Figure 3. Same as lower panel of Figure 2 but for a larger
value of $|\delta_0|$, and note that now the y-axis is on a log scale.
The analytic prediction, equation (28), for $\delta_0 = 0.9\delta_c$ describes
the sharp peak in the numerical solution remarkably accurately.
It begins overestimating the numerical solution around
$\log_{10}\nu^2 \simeq 0.4$, at which point about 75% of the probability
has been accounted for (see text for why this happens).
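
One way to gain confidence in the closed-form conditional distribution is to compare it with a direct numerical integration of the defining integral, $\int_0^\infty d\delta'\,\delta'\,p(\delta_c,\delta'|\delta_0)$. The sketch below does this for Gaussian smoothing of a power-law spectrum. Everything in it is illustrative: the function names are hypothetical, the value $\Gamma^2 = (n+3)/2$ is the assumed power-law result, and the conditional mean and variance of $\delta'$ are written in terms of $Q = 1 - S_\times^2/(sS_0)$, $S_\times/S_0$ and $\epsilon_\times$ as obtained by conditioning the Gaussian on $\delta_c$ and $\delta_0$.

```python
import math

def cross_terms(S0_over_s, n):
    """(S_x/S_0, eps_x) for Gaussian smoothing of P(k) ~ k^n."""
    rp = S0_over_s ** (2.0 / (n + 3.0))
    return (2.0 / (1.0 + rp)) ** ((n + 3.0) / 2.0), 2.0 * rp / (1.0 + rp)

def f_conditional(s, delta0, S0, delta_c, n):
    """Closed-form conditional first crossing distribution f(s|delta0, S0)."""
    ratio, eps = cross_terms(S0 / s, n)
    Gamma2 = (n + 3.0) / 2.0                     # assumed power-law value
    Q = 1.0 - ratio ** 2 * S0 / s
    dcx = delta_c - delta0 * ratio
    mean = (dcx + eps * ratio * (delta0 - delta_c * ratio * S0 / s)) / (2.0 * s * Q)
    sigma = math.sqrt((1.0 - Gamma2 * (1.0 - Q) * (1.0 - eps) ** 2 / Q)
                      / (4.0 * Gamma2 * s))
    x = mean / sigma
    # mean * (1 + erf)/2 + sigma * phi(x): the square bracket times the mean
    integral = (mean * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
                + sigma * math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi))
    return math.exp(-dcx ** 2 / (2.0 * s * Q)) / math.sqrt(2.0 * math.pi * s * Q) * integral

def f_conditional_direct(s, delta0, S0, delta_c, n, n_intervals=20000):
    """Same quantity by Simpson quadrature over the bivariate Gaussian with covariance C - c."""
    ratio, eps = cross_terms(S0 / s, n)
    Gamma2 = (n + 3.0) / 2.0
    one_minus_Q = ratio ** 2 * S0 / s
    # Entries of C - c; <delta'^2> = (1 + Gamma^2) / (4 s Gamma^2)
    m11 = s * (1.0 - one_minus_Q)
    m12 = 0.5 * (1.0 - eps * one_minus_Q)
    m22 = (1.0 + Gamma2) / (4.0 * s * Gamma2) - eps ** 2 * one_minus_Q / (4.0 * s)
    det = m11 * m22 - m12 ** 2
    u = delta_c - delta0 * ratio                 # first component of Delta - xbar
    shift = delta0 * ratio * eps / (2.0 * s)     # second component of xbar

    def integrand(dp):
        v = dp - shift
        quad = (m22 * u * u - 2.0 * m12 * u * v + m11 * v * v) / det
        return dp * math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

    upper = abs(shift) + abs(u) * abs(m12) / m11 + 12.0 * math.sqrt(m22)
    h = upper / n_intervals
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

Writing the bracket as `mean * (1 + erf)/2 + sigma * phi` avoids dividing by the conditional mean, which can pass through zero for large $\delta_0$.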



3.4 Comparison with Monte Carlo solution 



Figure 2 compares the prediction in equation (28) with
a Monte Carlo solution of the conditional first crossing
distribution. The comparison is for Gaussian filtered random
walks using a power spectrum $P(k) \propto k^{-1.2}$. The
numerical treatment uses the algorithm of Bond et al.
(1991) and was described in Paranjape et al. (2012). The
histograms are the same as in Figure 6 of Paranjape et
al. and show the distribution of first crossing scales for
a constant barrier, for walks that were required to pass
through the indicated values of $\delta_0$ at scale $S_0$, for two
choices of $S_0$. We see that the analytic prediction works
very well in describing the numerical solution.

This good agreement is despite the fact that equation (26)
formally ignores walks which might have crossed
the barrier prior to $S_0$. This can be understood by the
fact that the values of $\delta_0$ being considered in Figure 2
are significantly smaller than $\delta_c$, so that very few of the
walks would have reached the barrier prior to $S_0$ and
then returned to pass through $\delta_0$ at $S_0$. One can then
ask whether the expression in equation (28) would continue
to be accurate even for $\delta_0 \lesssim \delta_c$, since this is the
regime of interest for calculations of merger rates.

We test this in Figure 3, which compares equation (28)
with the Monte Carlo solution for the same
choice of conditioning scale $S_0$ as in the lower panel of
Figure 2, but with a larger magnitude for $\delta_0$, which is now
$|\delta_0| = 0.9\delta_c$. We see that for $\delta_0 = +0.9\delta_c$, the numerical
solution has a sharp peak which is very well described
by equation (28). The latter starts overestimating the
numerical answer around $\log_{10}\nu^2 \simeq 0.4$, which can be
understood as follows.

Paranjape et al. (2012) demonstrated in their Figure 7
that the numerical conditional distributions are, to
a good approximation, related to the corresponding unconditional
one by a simple scaling relation which sends
$\nu \to \nu_{1|0} \equiv \delta_{c\times}/\sqrt{sQ}$ in the unconditional distribution.
This is also approximately true of the analytic expression
in equation (28). Since equation (24) is not a good
approximation to the unconditional first crossing distribution
at small values of $\nu$ (Musso & Sheth 2012), it
follows that the corresponding analytic conditional distribution
will not be a good approximation at small $\nu_{1|0}$.
One can check that, for the choices of $\delta_0$ and $S_0$ in Figure 3,
$\nu_{1|0}$ actually passes through zero and becomes negative
around $\log_{10}\nu^2 \simeq 0.5$. So the mismatch between
the analytic prediction and the numerical solution is not
surprising. In practice, $\int d\ln\nu\,\nu f(\nu|\delta_0) = 0.75$ when the
integral is restricted to the values of $\nu$ above this point,
indicating that the prediction is inaccurate only for the 25%
of walks which cross at the largest values of $s$ (smallest values of
$\nu$).



3.5 Halo bias with correlated steps 

Now that we have in hand a good approximation to the 
conditional first crossing distribution, we can turn to the 
associated description of halo bias. 

The first issue that we would like to address is if Hermite
polynomials of the smoothed matter density field
are still special. Appendix B suggests that they are, as
long as the underlying matter density field is Gaussian.
More formally, we show there that the role of the Hermite
polynomial $H_n(\delta_0/\sqrt{S_0})$ in the average is that of removing
from it all the disconnected parts, so that only the
connected part of the expectation value of $\delta_0^n$ remains.

Secondly, for reasons discussed in section 2.3, in
principle we must specify the probability distribution
$q(\delta_0,S_0;\delta_c)$ to be used in the average. In the present
case, $q$ is not known analytically. However, in the spirit
of Musso & Sheth (2012), we can argue that the error
in ignoring the difference between $p$ and $q$ is of the same
order as that already included in $f(s|\delta_0,S_0)$. Indeed, the
fact that the conditional distributions shown in Figure 2
are such an accurate description of the numerical solution
means that, in this case, the approximation is consistent.
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Gaussian, n=- 1.2 
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brings these into the form (see Appendix I A2I 



0.5 - 



Figure 4. Distribution of the height $\delta_0$ on scale $S_0$ of walks which have not crossed $\delta_c$ prior to $S_0$, for a range of choices of $S_0$ (histograms), measured in the same Monte-Carlo simulations which were used to make Figures 2 and 3. Dashed curves show that a Gaussian, truncated at $\delta_0 = \delta_c$, provides a good approximation. Note that $S_0/\delta_c^2 = 300 \times 0.05^2 = 3/4$ corresponds to smoothing scales which are of order that associated with a typical halo: therefore, if one restricts attention to smaller $S_0$, then ignoring the truncation of the Gaussian should be a good approximation.



This is a consequence of the fact that for correlated steps zig-zags are exponentially rare at small $S_0$; in this limit, $p \approx q$. We can test this explicitly by looking directly at the distribution $q$ in our Monte-Carlos. Figure 4 shows that, for $S_0$ values which are smaller than those associated with typical halos, the difference between $q$ and the Gaussian is almost negligible.

Motivated by this simplification, let us define the real-space bias coefficients associated with the conditional distribution $f(s|\delta_0,S_0)$ using

$$b_n \equiv S_0^{-n/2}\,\left\langle (1+\delta_h)\,H_n(\delta_0/\sqrt{S_0}) \right\rangle = S_0^{-n/2}\int_{-\infty}^{\infty} d\delta_0\; p_G(\delta_0;S_0)\,\langle 1+\delta_h|\delta_0,S_0\rangle\, H_n(\delta_0/\sqrt{S_0}), \tag{32}$$

with $\langle 1+\delta_h|\delta_0,S_0\rangle = f(s|\delta_0,S_0)/f(s)$. Below we will show comparisons between numerical measurements of these quantities ($q$-averaged by construction) with analytic results using the $p$-averaged expression in the second line of (32). From the discussion above, we expect these to match well at least for the smallest $S_0$ shown in Figure 4.
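
The first line of equation (32) can be turned into a simple estimator. The sketch below is illustrative only: it assumes that the $(1+\delta_h)$-weighted spatial average is estimated by a plain average over halo positions (or over walks that first cross at the chosen $s$), and the names `hermite_he` and `hermite_weighted_bias` are hypothetical, not taken from the text.

```python
def hermite_he(n, x):
    """Probabilists' Hermite polynomial He_n(x).

    Uses He_0 = 1, He_1 = x and He_{k+1}(x) = x He_k(x) - k He_{k-1}(x).
    """
    h_prev, h = 1.0, float(x)
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def hermite_weighted_bias(delta0_at_halos, S0, n):
    """Estimate b_n = S0^(-n/2) <(1 + delta_h) H_n(delta_0 / sqrt(S0))>.

    delta0_at_halos: values of the smoothed matter overdensity delta_0,
                     sampled at halo positions (assumption of this sketch).
    S0:              variance of delta_0 on the large smoothing scale.
    """
    root = S0 ** 0.5
    mean_hn = sum(hermite_he(n, d / root) for d in delta0_at_halos) / len(delta0_at_halos)
    return mean_hn / S0 ** (n / 2.0)
```

For $n = 1$ this is just the mean of $\delta_0$ at halo positions divided by $S_0$; higher $n$ only change the polynomial applied to $\delta_0/\sqrt{S_0}$ before averaging.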

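For $f(s|\delta_0,S_0)$ given by equation (26), some algebra brings these into the form (see Appendix A2)

$$b_n = \left(-\frac{S_\times}{S_0}\right)^n \frac{1}{f(s)} \int_0^\infty d\delta'\,\delta' \left(\frac{\partial}{\partial\delta_c} + \frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^n p(\delta_c,\delta'), \tag{33}$$

with $f(s)$ given in equation (24). Appendix A3 shows that

$$f(s|\delta_0,S_0) = \sum_{n=0}^{\infty} \frac{1}{n!}\left(-\delta_0\frac{S_\times}{S_0}\right)^n \int_0^\infty d\delta'\,\delta' \left(\frac{\partial}{\partial\delta_c} + \frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^n p_G(\Delta; C-c) \tag{34}$$

holds exactly for the distribution (26), where $\Delta = (\delta_c,\delta')$ and the matrices $C$ and $c$ were defined in equation (14). Since the bivariate Gaussian $p_G(\Delta;C)$ is precisely the distribution $p(\delta_c,\delta')$ that appears in equation (33), we clearly have

$$\frac{f(s|\delta_0, c=0)}{f(s)} = \sum_{n=0}^{\infty} \frac{\delta_0^n}{n!}\,b_n. \tag{35}$$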



Setting $c = 0$ corresponds to the following assignments in equation (28):



I'Ig^vv [1- (^o/^c)(Sx/So)(l- 



0]■ 



(36) 



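As a result (see Appendix A4) the bias coefficients can be reduced to:

$$\delta_c^n\, b_n = \left(\frac{S_\times}{S_0}\right)^n \left(\alpha_n + \beta_n + \gamma_n\right),$$

where

$$\alpha_n = \nu^n H_n(\nu), \quad n \geq 1,$$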



/3n 



7" = 



1 



l + A 




-^(1-ex) 

(i-ex)"(r/.)"/f„_2(ri.) 

, n = 1 
n > 2, 



n = 1 
n> 2 



(37) 

(38) 
(39) 

(40) 



where $A$ was defined in equation (25).
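
A minimal numerical sketch of how $\delta_c^n b_n$ can be evaluated, obtained by Taylor-expanding the $c = 0$ conditional distribution of Appendix A4 in powers of $\delta_0/\delta_c$ and reading off the coefficients via equation (35). The grouping of terms by the index `k` is an assumption of this sketch and need not coincide with the split into $\beta_n$ and $\gamma_n$ used in the text; the `k = 0` term is $\alpha_n$. The quantity $A$ of equation (25) is passed in as a number, and the function names are hypothetical.

```python
from math import comb


def hermite_he(n, x):
    """Probabilists' Hermite polynomial He_n(x) from the three-term recurrence."""
    h_prev, h = 1.0, float(x)
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def scaled_bias(n, nu, gamma, A, sx_over_s0, eps_x):
    """Return delta_c**n * b_n for the c = 0 conditional distribution.

    nu         = delta_c / sqrt(s)
    gamma      = the parameter Gamma of the correlated-steps approximation
    A          = the quantity defined in equation (25), supplied by the caller
    sx_over_s0 = S_x / S_0
    eps_x      = epsilon_x
    """
    mu = gamma * nu
    total = 0.0
    for k in range(n + 1):
        if k == 0:
            weight = 1.0                      # gives alpha_n = nu^n He_n(nu)
        elif k == 1:
            weight = -A / (mu * (1.0 + A))    # single derivative of the delta' integral
        else:
            weight = hermite_he(k - 2, mu) / (1.0 + A)
        total += (comb(n, k) * nu ** (n - k) * hermite_he(n - k, nu)
                  * (mu * (1.0 - eps_x)) ** k * weight)
    return sx_over_s0 ** n * total
```

For $n = 1$ this returns $(S_\times/S_0)\,[\nu^2 - (1-\epsilon_\times)A/(1+A)]$, and for $n = 2$ the coefficients of $\epsilon_\times^0$, $\epsilon_\times^1$ and $\epsilon_\times^2$ satisfy the relations in equation (46).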

There are some interesting parallels with the calculation for sharp-$k$ walks, and some important differences. There is obviously a close analogy between the $S_0 \to 0$ limit of sharp-$k$ walks and the $c \to 0$ limit for correlated steps, especially since the matrix $c$ is proportional to $S_0$ (c.f. equation 14, noting that the factor $S_\times/S_0$ becomes constant as $S_0 \to 0$). However, in the present case one is not throwing away all the dependence on $S_0$, since factors of $\epsilon_\times$ explicitly appear in the expression for the $b_n$. In particular, these factors of $\epsilon_\times$ would not have appeared if we had simply taken derivatives of the unconditional distribution (equation 24) with respect to $\delta_c$. This has an important consequence: for sharp-$k$ filtering, the quantities $b_n$ were independent of $S_0$, whereas here they depend explicitly on $S_0$. If we write equation (37) as

$$b_n = \left(\frac{S_\times}{S_0}\right)^n \sum_{k=0}^{n} \binom{n}{k}\, b_{nk}\,\epsilon_\times^k, \tag{41}$$

then it is the quantities $b_{nk}$ (rather than $b_n$) which are scale-independent. This will be important below when interpreting our results in terms of Fourier-space bias. Note that the $b_{n0}$ are the peak-background split parameters $f^{-1}(-\partial/\partial\delta_c)^n f$ which are of most interest in cosmological applications. This is obvious upon setting $\epsilon_\times = 0$ in equation (33).

Figure 5. Comparison of bias coefficients $b_1$ and $b_2$ of equation (37) (smooth curves) with corresponding measurements (points with Poisson errors) in the same Monte Carlo simulations used in Figures 2 and 3, for a range of $S_0$ values. The measurements were performed as described in section 2.3. The analytic prediction clearly tracks both the $s$- and $S_0$-dependence fairly accurately. There are small systematic differences, especially at large $s$, which arise because $q \neq p$ at large $S_0$, and our analytic approximation to $f(s|\delta_0,S_0)$ stops being a good approximation when $s \to S_0$.

Since $p \approx q$, in contrast to when steps are uncorrelated, one might expect equation (33) to be quite accurate. We test this explicitly in Figure 5 by comparing the results of evaluating the r.h.s. of equation (33) for $n = 1$ and $n = 2$ with corresponding measurements (performed as described in section 2.3) using the same Monte Carlo simulations that were used in Figures 2 and 3. By construction, the numerically estimated quantity is $q$-averaged, whereas the analytic curves show the Gaussian-averaged coefficients in equation (32). The analytic predictions closely track the measurements over a range of $s$-values for several choices of $S_0$. There are small systematic deviations which are likely due to a combination of the facts that $q \neq p$ at large $S_0$ and that the analytic prediction fails to be a good approximation at large $s$.

Since ignoring the difference between $p$ and $q$ is a good approximation, one might wonder if the effect of $\epsilon_\times$ can also be ignored; naively one expects the $q$-averaging to be irrelevant at small $S_0/s$ where $\epsilon_\times$ is also likely to be small. Figure 6 shows the results for $b_1$ and $b_2$ for one of the choices of $S_0$ from Figure 5, comparing the same measurements as in that figure with analytic expressions in which $\epsilon_\times$ is retained as per equations (39) and (40) (solid curves) or set to zero by hand (dashed curves). We see that the terms involving $\epsilon_\times$ contribute significantly and must be retained to get an accurate description of the bias.



3.6 Recovery of scale-independent bias factors 

The bias coefficients in Figure 5 show a strong dependence on the scale $S_0$. This is rather different from the case of sharp-$k$ filtering, for which the $b_n$ recovered from $p$-averaging (equation 7) were independent of $S_0$. Indeed, the scale-independence of the recovered $b_n$ was one of our motivations for cross-correlating with the Hermite-transformed field in the first place, so it is interesting to ask if the dependence on $S_0$ can be removed.

This turns out to be possible for two reasons. First, the scale dependence of $b_n$ is almost entirely due to the factors of $S_\times$ and $\epsilon_\times$ (the other effect comes from the small difference between $p$ and $q$ averaging). Second, equation (37) shows that the scale-independent $b_{nk}$ are linearly related to each other in such a way that measuring $b_1, \ldots, b_n$ is sufficient to recover all the $b_{10}, \ldots, b_{nk}$.

We demonstrate this explicitly for $n = 1$ and 2. For $n = 1$, we can write

$$b_1 = \frac{1}{\delta_c}\frac{S_\times}{S_0}\left[\nu^2 - \frac{A}{1+A} + \epsilon_\times\,\frac{A}{1+A}\right] \equiv \frac{S_\times}{S_0}\left(b_{10} + \epsilon_\times\, b_{11}\right). \tag{42}$$

Since

$$\delta_c\, b_{11} = \nu^2 - \delta_c\, b_{10}, \tag{43}$$

we can estimate

$$\delta_c\, b_{10} = \frac{\delta_c\,(S_0/S_\times)\,b_1 - \epsilon_\times\,\nu^2}{1-\epsilon_\times}. \tag{44}$$

Figure 6. Same as Figure 5 for one choice of $S_0$, but comparing the numerical answer for $b_1$ and $b_2$ with the analytic prediction equation (37) when the dependence on $\epsilon_\times$ is retained (solid red) or set to zero by hand (dashed blue). Clearly, retaining the dependence on $\epsilon_\times$ is important, indicating that our method is sensitive to the $k$-dependence of bias (see text). Black triangles show the result of implementing the recursive procedure described in the text for reconstructing the usual ($k$-independent) peak-background split parameters $b_{10}$ and $b_{20}$ (dotted curves) from these measurements. Although defined at finite $S_0$, the procedure works well in reproducing the $S_0$-independent $b_{n0}$.



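Similarly,

$$b_2 = \left(\frac{S_\times}{S_0}\right)^2 \left(b_{20} + 2\epsilon_\times\, b_{21} + \epsilon_\times^2\, b_{22}\right), \tag{45}$$

where the excursion set predictions for the coefficients $b_{2j}$ can be read off from equation (37). For example, $\delta_c^2\, b_{21} = \nu^2(A - \Gamma^2)/(1+A)$. But, more relevant to the present discussion,

$$\delta_c^2\, b_{21} = \nu^2(\delta_c b_{10} - 1) - \delta_c^2\, b_{20}, \qquad \delta_c^2\, b_{22} = \delta_c^2\, b_{20} + \nu^2(\nu^2 - 2\delta_c b_{10} + 1). \tag{46}$$

Hence,

$$\delta_c^2\, b_{20} = \frac{(S_0/S_\times)^2\,\delta_c^2\, b_2 - \epsilon_\times\,\nu^2\left[2\delta_c\,(S_0/S_\times)\,b_1 - \epsilon_\times(\nu^2 - 1) - 2\right]}{(1-\epsilon_\times)^2}. \tag{47}$$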

We have deliberately isolated the peak-background split parameters $b_{n0}$ above. From the structure of the coefficients in equation (37) it is clear that this reconstruction can be extended to the higher order coefficients as well.
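
The reconstruction for $n = 1$ and $n = 2$ can be sketched in a few lines. The code below is an illustration under the assumption that $b_1$, $b_2$, $S_\times/S_0$ and $\epsilon_\times$ have already been measured at the chosen scale; it implements equations (43), (44), (46) and (47) as written above, and the function names are hypothetical.

```python
def reconstruct_b10(b1, delta_c, nu, sx_over_s0, eps_x):
    """Return delta_c * b_10 from a measured b_1, following equation (44)."""
    if eps_x == 1.0:
        raise ValueError("the reconstruction is undefined for eps_x = 1 (s = S_0)")
    B1 = delta_c * b1 / sx_over_s0
    return (B1 - eps_x * nu ** 2) / (1.0 - eps_x)


def reconstruct_b20(b1, b2, delta_c, nu, sx_over_s0, eps_x):
    """Return delta_c**2 * b_20 from measured b_1 and b_2, following equation (47)."""
    if eps_x == 1.0:
        raise ValueError("the reconstruction is undefined for eps_x = 1 (s = S_0)")
    B1 = delta_c * b1 / sx_over_s0
    B2 = delta_c ** 2 * b2 / sx_over_s0 ** 2
    correction = eps_x * nu ** 2 * (2.0 * B1 - eps_x * (nu ** 2 - 1.0) - 2.0)
    return (B2 - correction) / (1.0 - eps_x) ** 2


def higher_coefficients(B10, B20, nu):
    """Return (delta_c*b_11, delta_c^2*b_21, delta_c^2*b_22) from equations (43) and (46)."""
    B11 = nu ** 2 - B10
    B21 = nu ** 2 * (B10 - 1.0) - B20
    B22 = B20 + nu ** 2 * (nu ** 2 - 2.0 * B10 + 1.0)
    return B11, B21, B22
```

A round trip (choose $b_{10}$ and $b_{20}$, build $b_1$ and $b_2$ from equations 42 and 45, then reconstruct) returns the input values for any $\epsilon_\times \neq 1$, which is a convenient self-check before applying the procedure to measurements.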

The dotted curves in Figure 6 show the analytic predictions for $b_{10}$ and $b_{20}$ from equation (37), while the triangular symbols show the numerical estimates using equations (44), (47) and the corresponding measurements of $b_1$ and $b_2$. Clearly, the reconstruction works well. Moreover, since we are working at finite $S_0$, our procedure has allowed a simple and direct estimate of the peak-background split parameters $b_{n0}$ from a measurement of scale-dependent bias, without having to access very large scales. E.g., the Figure shows results for $S_0 = 0.075\,\delta_c^2$, which corresponds to the scale associated with a $\nu \approx 3.7$ halo and a Lagrangian length scale of $R_0 \approx 17\,h^{-1}$Mpc; most other analyses of halo bias are restricted to length scales which are several times larger.

Another way to see this is to notice that, in the expressions above, $b_n \to b_{n0}$ when $\epsilon_\times \to 0$. Since $\epsilon_\times \to 0$ on large scales, the analysis above shows explicitly that our method for reconstructing $b_{n0}$ works even on the smaller scales where $\epsilon_\times \neq 0$. Indeed, although we have concentrated on isolating $b_{n0}$, the analysis above shows that we can isolate the other $b_{nk}$ as well. For example, having measured $b_1$ and $b_2$ using our Hermite-weighting scheme, and having used equations (44) and (47) to estimate $b_{10}$ and $b_{20}$, equation (46) furnishes estimates of $b_{21}$ and $b_{22}$.

The expressions above show that our method will break down if $\epsilon_\times = 1$, which happens when $s \to S_0$. This is not surprising since this is the limit in which the large scale environment is the same as that on which the halo was defined, so our expressions for the conditional distribution are becoming ill-defined. Since this regime is substantially smaller than the one of most interest in cosmology, we conclude that our method allows a substantial range of interesting scales to provide estimates of the bias factors $b_{nk}$.
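
To make the limits $\epsilon_\times \to 0$ and $\epsilon_\times = 1$ concrete, the sketch below evaluates $S_\times/S_0$ and $\epsilon_\times$ in closed form for the setup used in the Monte Carlo tests: Gaussian filters and a power-law spectrum $P(k) \propto k^n$. It assumes $\epsilon_\times \equiv 2\,\mathrm{d}\ln S_\times/\mathrm{d}\ln s$ (the definition implied by the Dirac delta in Appendix A2) and $n > -3$; for Gaussian filters the product $W(kR)W(kR_0)$ is itself a Gaussian of scale $\sqrt{(R^2+R_0^2)/2}$, which gives the expressions in the code. The function name is hypothetical.

```python
def cross_scale_factors(R, R0, n=-1.2):
    """Return (S_x/S_0, eps_x, s/S_0) for Gaussian filters and P(k) ~ k^n.

    R  : Lagrangian smoothing scale of the halo (variance s)
    R0 : large smoothing scale (variance S_0)
    Assumes eps_x = 2 dln(S_x)/dln(s); this is an assumption of the sketch.
    """
    if n <= -3.0:
        raise ValueError("the variance integral requires n > -3")
    x2 = (R / R0) ** 2
    power = -(n + 3.0) / 2.0
    sx_over_s0 = (0.5 * (1.0 + x2)) ** power   # S_x = sigma^2 at R_eff^2 = (R^2 + R0^2)/2
    eps_x = 2.0 * x2 / (1.0 + x2)              # -> 0 for R0 >> R, = 1 for R0 = R
    s_over_s0 = x2 ** power
    return sx_over_s0, eps_x, s_over_s0
```

In this setup $\epsilon_\times = 2R^2/(R^2+R_0^2)$, so it vanishes on large scales, equals 1 exactly when $s = S_0$, and $S_\times/S_0$ tends to the constant $2^{(n+3)/2}$ as $S_0 \to 0$.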



3.7 Real and Fourier-space bias 

The appearance of $\epsilon_\times$ in the real-space expressions for $b_n$ generically indicates that the bias in Fourier space must be $k$-dependent. This is most easily seen with $b_1$ using a Gaussian filter $W(kR) = e^{-k^2R^2/2}$. Suppose that

$$\delta_0(\mathbf{k}) = \delta(\mathbf{k})\,W(kR_0) \tag{48}$$

and

$$\delta_h(\mathbf{k}) = b_1(\mathbf{k})\,\delta(\mathbf{k})\,W(kR), \tag{49}$$

so that in real space $\langle \delta_h|\delta_0\rangle = (\delta_0/S_0)\,\langle \delta_h\delta_0\rangle$. Then equation (42) implies that

$$b_1(\mathbf{k}) = b_{10} + \frac{s\,k^2}{\sigma_1^2}\, b_{11}, \tag{50}$$

where $\sigma_1^2 \equiv \int \frac{d^3k}{(2\pi)^3}\, k^2 P(k)\, W^2(kR)$. This shows that the excursion set analysis makes a prediction for how the Fourier-space coefficients $b_{10}$ and $b_{11}$ should depend on $\nu = \delta_c/\sigma$ and $\Gamma$.
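
The step from equation (42) to a constant-plus-$k^2$ bias can be checked with a short calculation. The sketch below assumes $\epsilon_\times \equiv 2\,\mathrm{d}\ln S_\times/\mathrm{d}\ln s$ (the definition implied by the Dirac delta in Appendix A2) and uses $\sigma_1^2 \equiv \int \frac{d^3k}{(2\pi)^3} k^2 P(k) W^2(kR)$ as a working symbol; for the Gaussian filter, $\partial W(kR)/\partial R = -k^2 R\, W(kR)$.

```latex
\begin{align*}
S_\times &= \int\!\frac{d^3k}{(2\pi)^3}\,P(k)\,W(kR)\,W(kR_0), &
s &= \int\!\frac{d^3k}{(2\pi)^3}\,P(k)\,W^2(kR), \\
\frac{dS_\times}{dR} &= -R\int\!\frac{d^3k}{(2\pi)^3}\,k^2 P(k)\,W(kR)\,W(kR_0), &
\frac{ds}{dR} &= -2R\,\sigma_1^2, \\
\epsilon_\times S_\times &= 2s\,\frac{dS_\times}{ds}
  = \frac{s}{\sigma_1^2}\int\!\frac{d^3k}{(2\pi)^3}\,k^2 P(k)\,W(kR)\,W(kR_0), \\
\langle\delta_h\delta_0\rangle &= \int\!\frac{d^3k}{(2\pi)^3}\,b_1(k)\,P(k)\,W(kR)\,W(kR_0)
  = S_\times\left(b_{10} + \epsilon_\times\,b_{11}\right)
  \quad\text{if}\quad b_1(k) = b_{10} + \frac{s\,k^2}{\sigma_1^2}\,b_{11}.
\end{align*}
```

Dividing the last line by $S_0$ reproduces the structure $(S_\times/S_0)(b_{10} + \epsilon_\times b_{11})$ of equation (42), so the real-space factor $\epsilon_\times$ is the filtered average of the $k^2$ term.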

It is remarkable that peaks theory predicts this same structure (constant plus $k^2$) for the linear Fourier space bias factor (Desjacques et al. 2010). Although the coefficients $b_{10}$ and $b_{11}$ for peaks differ from those for the excursion set halos studied here, the relation (43) between these coefficients is the same. We have checked explicitly that peaks also satisfy the relationships between the second order bias coefficients as shown in equation (47) (although the actual values of $b_{20}$, $b_{21}$ and $b_{22}$ are different), and so we expect this correspondence between the $k$-dependence of peak and halo bias will hold for all $n$. Because this correspondence is seen in two very different analyses (excursion sets and peaks), there is likely to be a deeper reason for its existence.

We explore this further in Appendix B, where we discuss the relation between our analysis and the work of Matsubara (2011), who has argued that $k$-dependent bias factors are generically associated with nonlocal biasing schemes. He provides a number of generic results for such nonlocal bias, noting that the Fourier-space structure at order $n$ can be written in terms of what he calls renormalized bias coefficients $c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n)$. For peaks theory,

$$c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n) = b_{n0} + b_{n1}\sum_i k_i^2 + b_{n2}\sum_{i<j} k_i^2 k_j^2 + \ldots \tag{51}$$

In this case, for Gaussian initial conditions, the Hermite-weighted averages (with a Gaussian filter as per equation B2) show a structure that is identical to our excursion set predictions of (41). More generally, our Hermite-weighting scheme provides a practical way of measuring integrals of Matsubara's renormalized bias coefficients $c_n$.

We therefore conclude that our real-space Hermite-weighted prescription for measuring halo bias can allow us to separate the scale-dependent contribution to bias as well as isolate the scale-independent (peak-background split) part arising from each order $n$, which traditional Fourier-space measurements cannot do. The specific results of our excursion set analysis (e.g., the relations between the $b_{nk}$) are then predictions that can be tested in more realistic settings such as $N$-body simulations. But this is beyond the scope of the present work.

4 CONCLUSIONS AND DISCUSSION 

We provided an analytic approximation for the first crossing distribution for walks with correlated steps which are constrained to pass through a specified position (equation 28), and showed that it was accurate (Figure 2). Although this is interesting in its own right, we did not explore this further. Rather, we used it to provide a simple analytic expression for the large scale halo bias factors (equation 37), showing that, as a result of correlations between scales, real space measures of halo bias are scale dependent (equation 41 and Figure 5), but this scale dependence is best thought of as arising from $k$-dependent bias in Fourier space (Section 3.7). Although we presented comparisons with numerical results for a specific choice of filter (Gaussian) and power spectrum ($P(k) \propto k^{-1.2}$), the results of Musso & Sheth (2012) lead us to expect that our analytical results will be equally accurate for other filters and power spectra, including TopHat filtered $\Lambda$CDM.

For correlations which arise because of a Gaussian smoothing filter, the linear bias factor $b_1$ is a constant plus a term which is proportional to $k^2$. This is a consequence of the fact that our analysis is based on the approximation of Musso & Sheth (2012), which associates halos with places where the height of the smoothed field and its first derivative with respect to smoothing scale satisfy certain constraints. If constraining the second derivative as well leads to an even more accurate model of the first crossing distribution, then this would give rise to $k^4$-dependence in the bias. It is in this sense that $k$-dependent halo bias is part and parcel of the excursion set approach. Such $k$-dependence will lead to stochasticity in real space measures of bias (Desjacques & Sheth 2010); we have not pursued this further.

We also provided an algorithm for estimating the scale-independent coefficients of the $k$-dependent bias factors from real space measurements (Section 3.6). Although the method uses cross-correlations between the halo field and suitably transformed versions of the smoothed mass field at the same spatial position (equation 32), the bias factors it returns are independent of the scale on which this transformation is done (Figure 6). In particular, the coefficient of the $k$-independent part of the bias which our algorithm returns equals that associated with the peak-background split argument, even though our algorithm can be applied on scales for which the usual formulation of the peak-background split argument does not apply.

For Gaussian fields, the transformation we advocate uses the Hermite polynomials. Therefore, our work has an interesting connection to Szalay (1988) who noted that, instead of defining bias coefficients by writing $\delta_h$ as a Taylor series in $\delta_0$ as is usually done, one could have chosen to expand the mass field in Hermite polynomials. Our analysis shows that this is indeed a fruitful way to proceed, even when the bias factors are $k$-dependent.

There are two reasons why this is remarkable. First, our analysis shows that, for the excursion set model, the coefficients of the expansion in $\delta_0$ are the same as those for the expansion in Hermite polynomials (equations 32 and 35). There is no reason why this should be true in general. And second, Szalay explicitly assumed that halo bias was 'local': $\delta_h$ was a function of $\delta_0$ only. For local bias, the bias factors are $k$-independent; $k$-dependent bias factors are a signature that the bias is nonlocal (Matsubara 2011, with the $k$-dependence of peak bias discussed in Desjacques et al. 2010 being a specific example), so it is not a priori obvious that an expansion in Hermite polynomials would have been useful.

In Appendix B we showed why, even for nonlocally biased tracers of a Gaussian field, the Hermites are so special. For completeness, we also provided an analysis of the general case, in which the underlying field is not necessarily Gaussian (equation B18). This more general analysis may prove useful should it turn out that the primordial fluctuation field was non-Gaussian, or if one wishes to describe halo bias with respect to the nonlinear Eulerian field rather than with respect to the initial one.

In the former case, primordial non-Gaussianity is expected to be sufficiently weak that the Edgeworth expansion can be used to provide insight into the expected modifications to halo abundances. Since Hermites play an important role in the Edgeworth expansion, it is likely that our Hermite-based algorithm for halo bias will be useful for constraining $f_{\rm NL}$.

Recent work has emphasized the advantages of using cross- rather than auto-correlations to estimate halo bias (Smith et al. 2007; Pollack et al. 2012). Since $H_n$ is an $n$-th order polynomial in the mass field, one may think of our algorithm as an extension of this program: it uses two-scale halo-mass cross-correlations at the same real-space position to extract information which is usually obtained from $n$-point statistics. However, in addition to being simpler, our algorithm is able to estimate the bias coefficients on smaller scales than those on which the more traditional $n$-point (Fourier or real-space) analyses are performed. So we expect it to find use in analyses of halo bias in simulations, and galaxy bias in real datasets.

For example, one can compare our prescription with traditional methods of estimating bias in real space, e.g. Manera & Gaztañaga (2012). Here, instead of computing averages of the matter field centered at locations of halos (as is natural in the excursion set approach), one explicitly defines a halo field $\delta_h(\mathbf{x})$ smoothed on a grid of cell-size $R_0$ and uses the matter field $\delta_0(\mathbf{x})$ smoothed on the same grid. One then fits a polynomial of the type $\delta_h = b_0 + b_1\delta_0 + b_2\delta_0^2/2$ to a scatter plot of $\delta_h$ vs. $\delta_0$ using a least squares prescription. This is conceptually the same as approximating the function $\langle\delta_h|\delta_0\rangle$ (which is most easily seen by considering linear biasing of a Gaussian field, for which the statement is exact). This can be compared with the excursion set prediction $\langle 1+\delta_h|\delta_0\rangle = f(s|\delta_0,S_0)/f(s)$, and we see that the coefficients obtained from the fit will generically depend on $S_0$. As Manera & Gaztañaga show, one needs to define a grid on very large scales ($R_0 \gtrsim 40\,h^{-1}$Mpc) in order to recover scale independent bias coefficients. On the other hand, our prescription can in principle operate at much smaller scales and remove this scale dependence by basically computing weighted integrals of the mean relation in the $\delta_h$-$\delta_0$ scatter plot. A more detailed comparison with traditional techniques is complicated by the fact that we have made predictions for Lagrangian bias whereas analyses such as Manera & Gaztañaga's typically work in the final, Eulerian field. We leave such a comparison to future work.

In this context, it is worth noting that our algorithm is more than just a simple way of estimating the nonlinear bias coefficients $b_n$. For example, there has been recent interest in reducing the stochasticity between the underlying mass field and that defined by the biased tracers (Hamaus et al. 2010; Cai et al. 2011). Some of this stochasticity is due to the nonlinear nature of the bias (Hamaus et al. 2011). Our demonstration that the nonlinear bias factors measure the amplitude of the cross-correlation function between the halo field and the Hermite-transformed mass field will simplify such analyses.

APPENDIX A: DETAILS OF CALCULATIONS

In this Appendix we sketch the proofs of various identities 
stated in the main text. 



A1 Proof of equation (7)

To prove equation (7) for sharp-$k$ walks, it is useful to consider the following Fourier transform relations involving the Hermite polynomials, which follow from the definition of the $H_n$:

$$\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,H_n(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ikx}\,(-ik)^n\,e^{-k^2/2}, \qquad (-ik)^n\,e^{-k^2/2} = \int_{-\infty}^{\infty}\frac{dx}{\sqrt{2\pi}}\,e^{-ikx}\,e^{-x^2/2}\,H_n(x). \tag{A1}$$



For the conditional first crossing distribution of equation (3), we use the relation

$$s f_u(s|\delta_0,S_0) = s\,\frac{\delta_c-\delta_0}{s-S_0}\,p_G(\delta_c-\delta_0;\, s-S_0). \tag{A2}$$

Using $y_0 = \delta_0/\sqrt{S_0}$ and $\nu = \delta_c/\sqrt{s}$ one can write
' sUs\5o,So)Hn{5o/VS^>)' 



si dv 



( — ik) e ' 



o27r 
/"ff„+i(i.) 



(A3) 



where the second equality follows from writing the Fourier integrals corresponding to the Hermite polynomial and the Gaussian in $(y_0 - \nu\sqrt{s/S_0})$, doing the integral over $y_0$ to give a Dirac delta and using this to perform one Fourier-space integral. The third equality then follows from equation (A1). Together with $s f_u(s) = (2\pi)^{-1/2}\,\nu\, e^{-\nu^2/2}$ this gives the result.



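A2 Form of bias coefficients in equation (33)

The weighted average of the distribution (26) is

$$\left\langle f(s|\delta_0,S_0)\,H_n(\delta_0/\sqrt{S_0})\right\rangle = \int_0^\infty d\delta'\,\delta' \int_{-\infty}^{\infty} d\delta_0\; p_G(\delta_0;S_0)\,H_n\!\left(\frac{\delta_0}{\sqrt{S_0}}\right) p(\delta_c,\delta'|\delta_0). \tag{A4}$$

The product $p_G(\delta_0;S_0)\,H_n(\delta_0/\sqrt{S_0})$ and the bivariate Gaussian $p(\delta_c,\delta'|\delta_0)$ (equation 27) can be expressed in terms of their Fourier transforms: i.e., we use equation (A1) and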


p{Sc,S'\5o) = 



(27r) = 



e ^ 'e 2 ^ I 



(A5) 



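with $\Delta = (\delta_c,\delta')$ and $x$, $C$ and $c$ given by equation (15) and (14), respectively. The integral over $\delta_0$ then gives a one-dimensional Dirac delta $\delta_D(k_0 - kS_\times/S_0 - k'\epsilon_\times S_\times/2sS_0)$ where $k_0$, $k$ and $k'$ are the Fourier variables corresponding to $\delta_0$, $\delta_c$ and $\delta'$, respectively. Performing the $k_0$ integral gives an expression in which the contribution of the "correction" matrix $c$ exactly cancels. The result can be expressed as

$$\left\langle f(s|\delta_0,S_0)\,H_n(\delta_0/\sqrt{S_0})\right\rangle = \left(-\frac{S_\times}{\sqrt{S_0}}\right)^n \int_0^\infty d\delta'\,\delta' \left(\frac{\partial}{\partial\delta_c} + \frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^n p(\delta_c,\delta'), \tag{A6}$$

and using $\langle 1+\delta_h|\delta_0,S_0\rangle = f(s|\delta_0,S_0)/f(s)$ gives the result (33).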



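A3 Taylor expansion of the conditional first crossing distribution in equation (34)

Using equation (27) and the shorthand notation $p_\Delta$ for $p_G(\Delta; C-c)$, where $\Delta = (\delta_c,\delta')$ and the matrices $C$ and $c$ were defined in equation (14), straightforward algebra shows that

$$p(\delta_c,\delta'|\delta_0) = \sum_{m,k=0}^{\infty}\frac{(-\delta_0 S_\times/S_0)^{m+k}}{m!\,k!}\left(\frac{\partial}{\partial\delta_c}\right)^m\left(\frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^k p_\Delta = \sum_{n=0}^{\infty}\frac{1}{n!}\left(-\delta_0\frac{S_\times}{S_0}\right)^n \sum_{k=0}^{n}\binom{n}{k}\left(\frac{\partial}{\partial\delta_c}\right)^k\left(\frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^{n-k} p_\Delta = \sum_{n=0}^{\infty}\frac{(-\delta_0)^n}{n!}\left(\frac{S_\times}{S_0}\right)^n\left(\frac{\partial}{\partial\delta_c} + \frac{\epsilon_\times}{2s}\frac{\partial}{\partial\delta'}\right)^n p_\Delta. \tag{A7}$$

Using this in the definition (26) proves equation (34).

A4 Explicit expressions for the bias coefficients in equation (37)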

The explicit form of the conditional distribution (26) in the limit $c \to 0$ allows for a more convenient calculation of the bias coefficients than computing the derivatives in equation (33). Using the relations (36) in equation (28) brings the conditional first crossing distribution to the form

$$s f(s|\delta_0, c=0) = \frac{e^{-\nu^2(1-\hat\delta_0 S_\times/S_0)^2/2}}{2\Gamma\sqrt{2\pi}} \int_0^\infty dz\, z\; p_G(z - \Gamma\nu + \hat\delta_0\,\nu_1;\, 1), \tag{A8}$$

where $\hat\delta_0 = \delta_0/\delta_c$ and $\nu_1 = \Gamma\nu\,(S_\times/S_0)(1-\epsilon_\times)$. The Taylor expansion of this expression in powers of $\hat\delta_0$ can now be used to read off the bias coefficients $b_n$ using equation (35). The Gaussian multiplying the integral can be expanded using the definition of the Hermite polynomials $H_n(\nu)$. The following relations are useful in simplifying the integral:



dz zpg{z — Tv; 1) = 



dzpG{z - Vv; 1) 



^[zpG{z-Vv,l)] 



e~ 








e~ 






72^ 






e" 












e" 





2ti 



(1 + ^) , 

rv' 

nHn-i{Vv), 
Hr^iTiy), (A9) 



where $A$ was defined in equation (25). Some manipulation then leads to the result quoted in equation (37).



APPENDIX B: RELATION BETWEEN 
MATSUBARA'S RENORMALISED 
COEFFICIENTS AND WEIGHTED 
AVERAGES OF THE MATTER DENSITY 

Matsubara (2011) has argued that $k$-dependent bias factors are generically associated with nonlocal biasing schemes and has provided a number of generic results for such nonlocal bias. In this appendix we show the connection between the "renormalised" coefficients $c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n)$ defined by him in terms of functional derivatives of the Fourier-space halo field $\delta_h(\mathbf{k})$ with respect to the matter field $\delta_{\mathbf{k}}$,

$$c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n) = (2\pi)^{3n}\int\frac{d^3k}{(2\pi)^3}\left\langle\frac{\delta^n\delta_h(\mathbf{k})}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}}\right\rangle, \tag{B1}$$

and the real-space weighted averages of the matter density field which we discuss in the main text. In particular, for Gaussian initial conditions, we show that the Hermite-weighted bias coefficients $b_n$ of equation (32) are just the integrals of the $c_n$, provided one formally uses the quantity $\rho_h(\mathbf{k})$ rather than $\delta_h(\mathbf{k})$ in defining the $c_n$, where $\rho_h(\mathbf{x}) = 1 + \delta_h(\mathbf{x})$. In this case,

$$b_n = \frac{1}{S_0^{n/2}}\left\langle(1+\delta_h)\,H_n(\delta_0/\sqrt{S_0})\right\rangle = \frac{1}{S_0^{n}}\int\frac{d^3k_1}{(2\pi)^3}\cdots\frac{d^3k_n}{(2\pi)^3}\,P_1\ldots P_n\,W_1\ldots W_n\; c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n), \tag{B2}$$

where $P_i = P(k_i)$, $W_i = W(k_iR_0)$ and $S_0 = (2\pi)^{-3}\int d^3k\,P(k)\,W(kR_0)^2$.

We demonstrate this in section B1 by working in Fourier space and explicitly evaluating the integral in the second line of equation (B2). In section B2 we work in real space, repeating the calculation in field theoretic language and showing that the bias coefficients can be interpreted as connected expectation values. This real-space calculation also shows how one might generalise our results to the case when the distribution of the matter field is not Gaussian.



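B1 Fourier space calculation

To prove equation (B2), note that in the definition (B1), the functional derivatives can be transferred to the probability density functional (which we denote as $\mathcal{P}[\delta_{\mathbf{k}}]$),

$$\left\langle\frac{\delta^n\rho_h(\mathbf{k})}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}}\right\rangle = \int\mathcal{D}[\delta_{\mathbf{k}}]\,\mathcal{P}[\delta_{\mathbf{k}}]\,\frac{\delta^n\rho_h(\mathbf{k})}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}} = (-1)^n\int\mathcal{D}[\delta_{\mathbf{k}}]\,\frac{\delta^n\mathcal{P}[\delta_{\mathbf{k}}]}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}}\,\rho_h(\mathbf{k}), \tag{B3}$$

where $\int\mathcal{D}[\delta_{\mathbf{k}}]$ denotes a functional integral. Also, statistical homogeneity allows us to introduce $1 = e^{i(\mathbf{k}+\mathbf{k}_{1\ldots n})\cdot\mathbf{x}}$, where $\mathbf{k}_{1\ldots n} = \mathbf{k}_1 + \ldots + \mathbf{k}_n$, and hence write the second line of (B2) as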



V[S^ 



e ph(k) 



(27r 



d^fc. 



(27r)3 ■ ■ ■ (27r)3 



— e ^ 



x^-::^"(-ir(2.f"A...p„ ^"^[^-^ 



^0 



(S(5k 



5(Sk 



(B4) 



For Gaussian initial conditions, $\mathcal{P}[\delta_{\mathbf{k}}] \propto \exp\left[-(1/2)\int d^3k\,\delta_{\mathbf{k}}\delta^*_{\mathbf{k}}/((2\pi)^3P(k))\right]$. The functional derivative of $\mathcal{P}[\delta_{\mathbf{k}}]$ can then be understood as follows (see also Matsubara 1995). Consider the action of a single functional derivative $\delta/\delta\delta_{\mathbf{k}_i}$. When this acts on the distribution $\mathcal{P}[\delta_{\mathbf{k}}]$, it brings down a factor $(-1)\,\delta^*_{\mathbf{k}_i}(2\pi)^{-3}P(k_i)^{-1}$. On the other hand, when it acts on an existing factor of $\delta^*_{\mathbf{k}_j}$, it gives a Dirac delta $\delta_D(\mathbf{k}_i + \mathbf{k}_j)$ (since $\delta^*_{\mathbf{k}_j} = \delta_{-\mathbf{k}_j}$). The result of $n$ derivatives on $\mathcal{P}[\delta_{\mathbf{k}}]$ can be organised as an alternating sum over terms containing an increasing number of Dirac deltas or connections between pairs of vectors $\mathbf{k}_i$, $\mathbf{k}_j$. The alternation arises because each connected pair carries a minus sign. For the $n$-th derivative, the term containing $p$ connected pairs (when multiplied by $(-1)^n(2\pi)^{3n}P_1\ldots P_n$) looks like



i-iy 



X P„-2p+2 5D(k„_2p+l + k,i_2p+2) 

X Pn 5D(k„_i + k,i) + perms. 



(B5) 



where "perms." indicates all permutations of the vectors $\mathbf{k}_j$. Since we integrate over all the $\mathbf{k}_j$ with a totally symmetric prefactor $e^{i\mathbf{k}_{1\ldots n}\cdot\mathbf{x}}(W_1\ldots W_n)$, all these permutations lead to identical contributions.

The product of $(2\pi)^{3n}P_1\ldots P_n$ with the $n$-th derivative of $\mathcal{P}[\delta_{\mathbf{k}}]$ therefore equals $\mathcal{P}[\delta_{\mathbf{k}}]$ multiplied by



[n/2] / 



2p 



(2p-l)!!(27r)^''d-k, ...5, 



X Y[ -Pn-2j SoO<-n~2j-l + k„-2j) , 
3 = 



(B6) 



where $[n/2]$ is the floor of $n/2$ and the combinatorial factor counts the number of partitions of $n$ distinct objects into $(n-2p)$ singletons and $p$ pairs, which is precisely the coefficient of $x^{n-2p}$ in the Hermite polynomial $H_n(x)$.
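
The combinatorial statement can be checked directly. The sketch below counts the partitions of $n$ labelled objects into $p$ pairs and $n-2p$ singletons as $n!/[(n-2p)!\,p!\,2^p]$ and compares this, including the alternating sign $(-1)^p$, with the coefficients of the probabilists' Hermite polynomial built from its recurrence. The function names are hypothetical.

```python
from math import factorial


def pairings_count(n, p):
    """Number of ways to split n labelled objects into p unordered pairs and n-2p singletons."""
    return factorial(n) // (factorial(n - 2 * p) * factorial(p) * 2 ** p)


def hermite_he_coefficients(n):
    """Integer coefficients [c_0, ..., c_n] with He_n(x) = sum_j c_j x^j.

    Built from He_0 = 1, He_1 = x, He_{k+1} = x He_k - k He_{k-1}.
    """
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + cur                 # multiply He_k by x
        for j, c in enumerate(prev):
            nxt[j] -= k * c             # subtract k * He_{k-1}
        prev, cur = cur, nxt
    return cur
```

For example, $H_4(x) = x^4 - 6x^2 + 3$: there are 6 ways to choose one pair among four objects and 3 ways to split them into two pairs.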

On performing the integrals over $\mathbf{k}_i$ in the term with $p$ connected pairs, the factors of $\delta^*_{\mathbf{k}_i}$ will contribute $(n-2p)$ powers of $\delta_0(\mathbf{x})$ and the Dirac deltas will contribute $p$ powers of $S_0$. Further identifying the inverse Fourier transform of $\rho_h(\mathbf{k})$ in the first line of (B4), we can write the expression in (B4) as



/ 



■D[Su]r[Si,] Ph(x) 



n — 2p 



SI 



i/2 



il + Si,)Hn{So/VSo) 



(B7) 



which completes the proof. 



B2 Real space calculation: bias as connected 
expectation values 

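In real space, the statement that $\rho_h(\mathbf{k})$ can be expressed in terms of the modes $\delta_{\mathbf{k}}$ of the matter field translates to the generic expansion

$$\rho_h(\mathbf{x}) = \sum_{k=0}^{\infty}\frac{1}{k!}\int d^3y_1\ldots d^3y_k\; b_k(\mathbf{x}-\mathbf{y}_1,\ldots,\mathbf{x}-\mathbf{y}_k)\,\delta(\mathbf{y}_1)\ldots\delta(\mathbf{y}_k), \tag{B8}$$

where the $b_k$ are the coefficients of the Taylor expansion of $\rho_h$ in powers of $\delta$,

$$b_k(\mathbf{x}-\mathbf{y}_1,\ldots,\mathbf{x}-\mathbf{y}_k) = \left.\frac{\delta^k\rho_h(\mathbf{x})}{\delta\delta(\mathbf{y}_1)\ldots\delta\delta(\mathbf{y}_k)}\right|_{\delta(\mathbf{y}_i)=0}, \tag{B9}$$

which are totally symmetric in their arguments.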

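Each term of Equation (B8) can be considered as a vertex with $k$ legs. The correlation function $\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z})\rangle$ can be computed using Wick's theorem to isolate the two-point correlation functions connecting $\delta_0(\mathbf{z})$ to any of the $\delta(\mathbf{y}_i)$'s in the sum, and get

$$\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z})\rangle = \sum_{k=1}^{\infty}\frac{1}{(k-1)!}\int d^3y_1\ldots d^3y_k\; b_k(\mathbf{x}-\mathbf{y}_1,\ldots,\mathbf{x}-\mathbf{y}_k)\,\langle\delta(\mathbf{y}_1)\ldots\delta(\mathbf{y}_{k-1})\rangle\,\langle\delta(\mathbf{y}_k)\delta_0(\mathbf{z})\rangle. \tag{B10}$$

Since one also has

$$\frac{\delta\rho_h(\mathbf{x})}{\delta\delta(\mathbf{y})} = \sum_{k=1}^{\infty}\frac{1}{(k-1)!}\int d^3y_1\ldots d^3y_{k-1}\; b_k(\mathbf{x}-\mathbf{y}_1,\ldots,\mathbf{x}-\mathbf{y}_{k-1},\mathbf{x}-\mathbf{y})\,\delta(\mathbf{y}_1)\ldots\delta(\mathbf{y}_{k-1}), \tag{B11}$$

then one obtains

$$\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z})\rangle = \int d^3y\,\left\langle\frac{\delta\rho_h(\mathbf{x})}{\delta\delta(\mathbf{y})}\right\rangle\langle\delta(\mathbf{y})\delta_0(\mathbf{z})\rangle. \tag{B12}$$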

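Similarly, in order to compute any "connected" correlation function $\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z}_1)\ldots\delta_0(\mathbf{z}_n)\rangle_c$ one should retain only those terms where each of the $n$ external fields is connected to any of the internal fields of Equation (B8). Since the combinatorial factors generated by the action of Wick's theorem are the same as those obtained from differentiation, one gets

$$\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z}_1)\ldots\delta_0(\mathbf{z}_n)\rangle_c = \int d^3y_1\ldots d^3y_n\,\left\langle\frac{\delta^n\rho_h(\mathbf{x})}{\delta\delta(\mathbf{y}_1)\ldots\delta\delta(\mathbf{y}_n)}\right\rangle\prod_{j=1}^{n}\langle\delta(\mathbf{y}_j)\delta_0(\mathbf{z}_j)\rangle. \tag{B13}$$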

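Going to Fourier space one has $\langle\delta(\mathbf{y})\delta_0(\mathbf{z})\rangle = (2\pi)^{-3}\int d^3k\,e^{i\mathbf{k}\cdot(\mathbf{y}-\mathbf{z})}P(k)W(kR_0)$ and $\delta/\delta\delta_{\mathbf{k}} = (2\pi)^{-3}\int d^3y\,e^{i\mathbf{k}\cdot\mathbf{y}}(\delta/\delta\delta(\mathbf{y}))$, so that

$$\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z}_1)\ldots\delta_0(\mathbf{z}_n)\rangle_c = \int d^3k_1\ldots d^3k_n\prod_{j=1}^{n}\left[e^{-i\mathbf{k}_j\cdot\mathbf{z}_j}P(k_j)W(k_jR_0)\right]\left\langle\frac{\delta^n\rho_h(\mathbf{x})}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}}\right\rangle. \tag{B14}$$

If we write

$$\left\langle\frac{\delta^n\rho_h(\mathbf{x})}{\delta\delta_{\mathbf{k}_1}\ldots\delta\delta_{\mathbf{k}_n}}\right\rangle = e^{i(\mathbf{k}_1+\ldots+\mathbf{k}_n)\cdot\mathbf{x}}\,\frac{c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n)}{(2\pi)^{3n}}, \tag{B15}$$

then it is not hard to see that the $c_n$ above agrees with Matsubara's definition (equation B1 with $\delta_h \to \rho_h$) upon requiring statistical homogeneity, and moreover,

$$\langle\rho_h(\mathbf{x})\delta_0(\mathbf{z}_1)\ldots\delta_0(\mathbf{z}_n)\rangle_c = \int\frac{d^3k_1\ldots d^3k_n}{(2\pi)^{3n}}\prod_{j=1}^{n}\left[e^{-i\mathbf{k}_j\cdot(\mathbf{z}_j-\mathbf{x})}P(k_j)W(k_jR_0)\right] c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n). \tag{B16}$$

In other words, the second line of equation (B2) corresponds to the quantity $\langle\rho_h(\mathbf{x})\delta_0^n(\mathbf{x})\rangle_c/S_0^n$, where $S_0 = \langle\delta_0^2(\mathbf{x})\rangle$.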

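The connected $n$-point expectation value can be recursively obtained using

$$\begin{aligned}
\langle\rho_h\delta_0\rangle &= \langle\rho_h\delta_0\rangle_c + \langle\rho_h\rangle\langle\delta_0\rangle, \\
\langle\rho_h\delta_0^2\rangle &= \langle\rho_h\delta_0^2\rangle_c + 2\langle\rho_h\delta_0\rangle_c\langle\delta_0\rangle + \langle\rho_h\rangle\langle\delta_0^2\rangle, \\
\langle\rho_h\delta_0^3\rangle &= \langle\rho_h\delta_0^3\rangle_c + 3\langle\rho_h\delta_0^2\rangle_c\langle\delta_0\rangle + 3\langle\rho_h\delta_0\rangle_c\langle\delta_0^2\rangle + \langle\rho_h\rangle\langle\delta_0^3\rangle,
\end{aligned} \tag{B17}$$

and in general

$$\langle\rho_h\delta_0^n\rangle = \sum_{m=0}^{n}\binom{n}{m}\langle\rho_h\delta_0^m\rangle_c\,\langle\delta_0^{n-m}\rangle \tag{B18}$$

(with $\langle\rho_h\delta_0^0\rangle_c \equiv \langle\rho_h\rangle$) to remove the disconnected contributions from the average. Since $\delta_0$ is Gaussian-distributed, one has $\langle\delta_0^r\rangle = (r-1)!!\,S_0^{r/2}$ for $r$ even and $\langle\delta_0^r\rangle = 0$ for $r$ odd; writing back $\langle\rho_h\delta_0^m\rangle_c$ in terms of $\langle\rho_h\delta_0^m\rangle$ in the above expression for $m < n$, one recovers

$$\frac{\langle\rho_h\delta_0^n\rangle_c}{S_0^n} = \frac{1}{S_0^{n/2}}\left\langle\rho_h\,H_n(\delta_0/\sqrt{S_0})\right\rangle. \tag{B19}$$

This therefore justifies the interpretation of the bias factors $b_n$ as the connected parts of the $n$-point expectation values.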
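
The equivalence between the recursion (B18) and the Hermite-weighted average (B19) can be verified exactly in a toy setting. The sketch below assumes a purely local polynomial model $\rho_h = \sum_j a_j\delta_0^j$ with Gaussian $\delta_0$, so that every moment is known in closed form; this toy model and the function names are assumptions made for illustration and are not part of the analysis in the text.

```python
from math import comb, factorial


def gaussian_moment(r, S0):
    """<delta_0^r> for a zero-mean Gaussian: (r-1)!! S0^(r/2) if r is even, 0 if r is odd."""
    if r % 2:
        return 0.0
    result = 1.0
    for j in range(r - 1, 0, -2):
        result *= j
    return result * S0 ** (r // 2)


def raw_cross_moment(rho_coeffs, n, S0):
    """<rho_h delta_0^n> for the toy model rho_h = sum_j rho_coeffs[j] * delta_0^j."""
    return sum(a * gaussian_moment(j + n, S0) for j, a in enumerate(rho_coeffs))


def connected_cross_moment(rho_coeffs, n, S0):
    """<rho_h delta_0^n>_c obtained by inverting the recursion (B18)."""
    if n == 0:
        return raw_cross_moment(rho_coeffs, 0, S0)
    disconnected = sum(
        comb(n, m) * connected_cross_moment(rho_coeffs, m, S0) * gaussian_moment(n - m, S0)
        for m in range(n)
    )
    return raw_cross_moment(rho_coeffs, n, S0) - disconnected


def hermite_weighted_moment(rho_coeffs, n, S0):
    """S0^(n/2) <rho_h He_n(delta_0/sqrt(S0))>, using the explicit He_n coefficients."""
    total = 0.0
    for p in range(n // 2 + 1):
        count = factorial(n) // (factorial(n - 2 * p) * factorial(p) * 2 ** p)
        total += (-1) ** p * count * S0 ** p * raw_cross_moment(rho_coeffs, n - 2 * p, S0)
    return total
```

In this toy model both routes return $S_0^n\langle\partial^n\rho_h/\partial\delta_0^n\rangle$, which makes explicit why the connected part is the quantity that carries the bias information.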

Moreover, it is clear that the scale dependence of $\langle\rho_h\delta_0^n\rangle_c$ comes from the presence of the $n$ mixed correlation functions $\langle\delta(\mathbf{y}_j)\delta_0(\mathbf{z}_j)\rangle$ in Equation (B13), introducing $n$ occurrences of the filter $W(k_jR_0)$ in Equation (B14). Therefore one can expect the ratio $\langle\rho_h\delta_0^n\rangle_c/S_\times^n$ to be approximately scale invariant.

Similar considerations hold when the distribution of the matter field $\delta$ is non-Gaussian. This would include both the presence of non-Gaussian initial conditions and non-linear gravitational evolution. In this case, each external field $\delta_0(\mathbf{z}_i)$ is connected to $\rho_h(\mathbf{x})$ by the full non-Gaussian renormalized propagator, while the coefficients $c_n(\mathbf{k}_1,\ldots,\mathbf{k}_n)$ should be defined in terms of what in quantum field theory is usually called the 1-PI correlation function (that is, the sum of all the diagrams that cannot be split in two pieces by cutting one single line) amputated of the external legs. The bias coefficients in this case will not, in general, correspond to Hermite-weighted averages, but must be recursively constructed using equation (B18) (which still involves only 2-point measurements).



